Abstract

Since the summer of 1992, the Soar/IFOR research group has been building intelligent automated agents
for tactical air simulation. The ultimate goal of this project is to develop automated pilots whose behavior
in engagements is indistinguishable from that of human pilots. This technical report is a
collection of the research papers that have been generated from this project as of Spring 1994. The
research covered in these papers spans a wide spectrum of issues in agent development such as
explanation, managing situational awareness, managing multiple interacting goals, coordination between
multiple agents, natural language processing, developing believable agents, event tracking, and the
infrastructure to support agent development, including knowledge acquisition and use, interfacing to
simulation environments, and developing low cost simulators.
Preface


Since  the  summer  of  1992,  the  Soar/IFOR  research  group  has  been  building  intelligent  automated  agents 
for  tactical  air  simulation.  The  Soar/IFOR  research  project  exists  at  three  sites,  the  University  of 
Michigan,  the  University  of  Southern  California,  and  Carnegie  Mellon  University.  The  ultimate  goal  of 
this  project  is  to  develop  automated  pilots  whose  behavior  in  simulated  engagements  is  indistinguishable 
from  that  of  human  pilots.  Our  work  has  concentrated  on  developing  agents  for  beyond  visual  range 
engagements  where  there  are  either  one  or  two  fighter  planes  on  each  side. 

This  technical  report  is  a  collection  of  the  research  papers  that  have  been  generated  from  this  project  as 
of  Spring  1994.  Most  of  the  papers  were  presented  at  the  Fourth  Conference  on  Computer  Generated 
Forces  and  Behavioral  Representation  in  Orlando  in  May  1994.  The  others  include  our  paper  from  the 
Third  Conference  on  CGF  &  BR  and  two  papers  presented  at  other  workshops  and  conferences. 

The  research  covered  in  these  papers  spans  a  wide  spectrum  of  issues  in  agent  development  such  as 
explanation [3,4], managing situational awareness [5], managing multiple interacting goals [6],
coordination between multiple agents [8], natural language processing [9], developing believable agents
[11], and event tracking [12]. We have also done research on the infrastructure to support the
development  of  these  agents  which  includes  work  on  knowledge  acquisition  and  use  [7],  interfacing 
agents  to  simulation  environments  [10],  and  developing  low  cost  simulators  [13].  The  papers  are 
organized  by  having  the  two  overview  papers  first  (the  one  presented  last  year  followed  by  the  one  to  be 
presented  this  year)  [1]  &  [2],  followed  by  all  of  the  other  papers  in  alphabetic  order  by  author. 
Abstract

Training in flight simulators will be more effective if the agents
involved in the simulation behave realistically. Accomplishing this
requires that the automated agents be under autonomous, intelligent
control. We are using the Soar cognitive architecture to implement
intelligent agents that behave as much like humans as possible. In
order to approximate human behavior, the agents must integrate
planning and reaction in real time, adapt to new and unexpected
situations, learn with experience, and exhibit the cognitive
limitations and strengths of humans. This paper describes two simple
tactical flight scenarios and the knowledge required for an agent to
complete them. In addition, the paper describes an implemented agent
model that performs in limited tactical scenarios on three different
flight simulators.

The goal of this research is to construct intelligent, automated
agents for flight simulators that are used to train navy pilots in
flight tactics. When pilots train in tactical simulations, they learn
to react to (and reason about) the behaviors of the other agents
(friendly and enemy forces) in the training scenario. Thus, it is
important that these agents behave as realistically as possible.
Standard automated and semi-automated agents can provide this to a
limited extent, but trainees can quickly recognize automated agents
and take advantage of known weaknesses in their behavior. To provide
a more realistic training situation, automated agents should be
indistinguishable from other human pilots taking part in the
simulation.

To construct such intelligent, automated agents, we have applied
techniques from the fields of artificial intelligence and cognitive
science. The agents are implemented within the Soar system, a
state-of-the-art, integrated cognitive architecture (Rosenbloom et
al., 1991). These agents incorporate knowledge gleaned from
interviews with experts in flight tactics and analysis of the tactical
domain. Soar is a promising candidate for developing agents that
behave like humans. Flexible and adaptive behavior is one of Soar's
primary strengths, and Soar's learning mechanism provides it with the
capability of improving its performance with experience. In addition,
Soar allows the smooth integration of planning and reaction in
decision making (Pearson et al., 1993). Finally, Soar is the
foundation for the development of a proposed unified theory of human
cognition (Newell, 1990), and thus maps quite well onto a number of
the cognitive issues of interest. This paper reports the results of
our research in constructing an intelligent agent for an initial,
simple training scenario and our efforts at supplementing the agent's
knowledge in order to carry out more complex missions.

Complexities of tactical decision-making

In order to complete a tactical mission, pilots incorporate multiple
types of knowledge. These include, for example, knowledge about the
goals of the mission, airplane and weapon constraints, survival
tactics, controlling the vehicle, characteristics of the environment,
and the physical and cognitive capabilities of all of the agents
taking part in the scenario. In addition, pilots use their knowledge
flexibly and exhibit adaptive behavior. This includes a variety of
capabilities, such as reasoning about (and surviving in) unexpected
situations, adapting to new situations, learning from experience, and
addressing multiple goals simultaneously (e.g., protecting a position,
intercepting the enemy, and surviving). Finally, pilots integrate
decision-making during a mission with split-second reactions to new
situations and potential threats.

Robust automated forces that can carry out general simulated missions
must address these issues, especially if the forces are to behave as
humans would in similar circumstances. In addition to providing the
wide range of capabilities that human pilots exhibit, intelligent
agents must reflect the same types of weaknesses as humans. These
include mental limitations, such as attention and cognitive load, and
physical limitations, such as reduced cognitive processing under high
forces (such as during a hard turn).

To capture the complex interactions between agents in a simulation, we
feel it necessary for each agent to be as autonomous and intelligent
as possible. Simulation via stochastic methods can capture general
behaviors of groups of agents, but a more realistic simulation
requires each agent to behave individually, with its own set of goals,
constraints, and perceptions. In addition, if the agents are to be
used for training pilots, they must be intelligent in order to provide
as rich a training environment as would flying against real humans.

Requirements for an intelligent automated agent

The primary research question is how intelligent, automated agents
should be implemented. A simple solution would be to attempt to create
"simulation-pilot expert systems". This would involve converting
knowledge about high-level tactical decision-making into a fixed rule
base. The system would suggest the most appropriate action (or set of
actions) based on the current status of the environment. In fact, a
number of expert systems have been implemented for various aspects of
tactical decision-making (e.g., Kornell, 1987; Ritter & Feurzeig,
1987; Zytkow & Erickson, 1987).

However, while expert systems have some of the strengths required for
realistic simulation, they are usually weak in other areas. In a
standard rule-based approach, it is difficult to capture the
complexity of the multiple, dynamic goals that pilots must reason
about. In contrast, systems that can reason well in such a complex
domain generally have difficulties making decisions in real time, and
they often do not have the ability to react to changes in the
environment when there is not enough time to plan ahead. In addition,
systems with only high-level tactical knowledge prove to be rather
rigid. Unless the system can be preprogrammed for every possible
contingency, its performance degrades greatly when it finds itself in
unexpected situations. Finally, expert systems generally ignore the
possibility of learning with experience and other cognitive aspects of
the task. Intelligent, autonomous agents must combine all of these
strengths, having the ability to reason about multiple goals in a
complex environment, react quickly and appropriately when the time for
complex reasoning is limited, adapt to new situations gracefully, and
improve their behavior with experience.

In order to create an agent that can reason and react in real time,
and is flexible enough to adapt to new situations, it is not enough
simply to encode high-level tactics as rules in the system. Rather,
the system must also understand why each high-level tactical decision
is made, so it must contain knowledge of the first principles that
support those decisions. For example, part of one tactic for
intercepting a bogey involves achieving a desired lateral separation
from the bogey's flight path. One way to generate this behavior is to
include a specific rule for the agent to move to the desired lateral
separation when it is on the appropriate leg of the intercept.
However, a more intelligent agent encodes the knowledge that explains
why this particular tactic works (so that the fighter will have enough
space to come around for a rear-quarter shot if the long- and
medium-range missiles miss).

With the appropriate supporting knowledge, the system can function in
situations that the programmer may not have anticipated. Maintaining
lateral separation from the bogey's flight path is a general principle
that allows the fighter room to negotiate a turn for a short-range
missile shot. This principle may have an impact in a large number of
tactical situations, and therefore shouldn't be considered as merely
an instruction to follow for one particular type of intercept. If the
system reasons from first principles, the programmer does not have to
hard code every possible contingency, and good variations on tactics
should emerge in response to unanticipated changes in the simulation
environment.
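
To illustrate why lateral separation is a geometric quantity rather
than a scenario-specific instruction, the following Python sketch
computes it as the perpendicular distance from the fighter to the
bogey's flight path. The flat two-dimensional geometry, the compass
heading convention, and the function name lateral_separation are
assumptions made for this example; they are not taken from the
implemented agent.

```python
import math


def lateral_separation(fighter_xy, bogey_xy, bogey_heading_deg):
    """Perpendicular distance from the fighter to the bogey's flight path.

    Assumed convention: compass headings, 0 = north (+y), 90 = east (+x).
    Positions are (x, y) pairs in any consistent distance unit.
    """
    heading = math.radians(bogey_heading_deg)
    # Unit vector pointing along the bogey's flight path.
    ux, uy = math.sin(heading), math.cos(heading)
    # Vector from the bogey to the fighter.
    dx = fighter_xy[0] - bogey_xy[0]
    dy = fighter_xy[1] - bogey_xy[1]
    # The magnitude of the 2-D cross product is the perpendicular offset.
    return abs(dx * uy - dy * ux)


# A bogey flying due north through the origin; the fighter is 3 units east
# of that flight path, regardless of how far ahead of the bogey it is.
print(lateral_separation((3.0, 10.0), (0.0, 0.0), 0.0))
```

Because the quantity depends only on relative position and the bogey's
heading, the same computation applies to any intercept geometry, which
is the sense in which the principle is general.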

Implementing the agent in this manner also provides advantages in
terms of adding new knowledge to the system. If the tactical decisions
emerge from low-level knowledge, high-level decisions will change
appropriately as the supporting knowledge is changed or supplemented.
New low-level knowledge (such as a better understanding of geometric
principles or radar limitations) will interact with existing knowledge
to generate subtle (or possibly dramatic) changes in behavior. Thus,
the agent can reason in a number of new situations without requiring a
new specific rule for each case. The ease of adding new knowledge to
the system also makes it possible to incorporate existing
machine-learning mechanisms. These can allow the system to adapt and
improve its behavior with experience, as well as provide insights into
how human pilots learn about tactics.

The Soar architecture for problem solving (Newell, 1990) is well
suited for this type of task. It divides knowledge into problem spaces
and allows goals and actions in one problem space to be implemented
via reasoning in another. Thus, when the agent has a high-level goal
to intercept a bogey, for example, it can switch problem spaces and
reason about the characteristics of its weapons, radar, airplane, and
military doctrine. The knowledge from each of these spaces combines to
generate an appropriate tactical action. In turn, the high-level
action can then be implemented in a problem space that contains
medium-level knowledge about plane maneuvers or low-level knowledge
about moving the stick and flipping switches.
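
The layering described above can be pictured with a small Python
sketch. It is only an illustration of operators in one problem space
being implemented in another; it does not reproduce Soar's actual
subgoaling mechanism. The problem-space names follow the text, while
the operator names and the PROBLEM_SPACES table are hypothetical.

```python
# Hypothetical table: each problem space maps an operator to the
# lower-level (problem space, operator) pairs that implement it.
PROBLEM_SPACES = {
    "intercept": {
        "achieve-lateral-separation": [("maneuver", "turn-to-heading")],
    },
    "maneuver": {
        "turn-to-heading": [("control", "move-stick")],
    },
    "control": {
        "move-stick": [],  # primitive action, sent to the simulator
    },
}


def expand(space, operator, depth=0, trace=None):
    """Return the chain of (depth, space, operator) goals, top level first."""
    if trace is None:
        trace = []
    trace.append((depth, space, operator))
    # Each implementing operator becomes a subgoal in its own problem space.
    for sub_space, sub_operator in PROBLEM_SPACES[space][operator]:
        expand(sub_space, sub_operator, depth + 1, trace)
    return trace


for depth, space, operator in expand("intercept", "achieve-lateral-separation"):
    print("  " * depth + f"{space}: {operator}")
```

Replacing the entries of one problem space, for instance "control" for
a different simulator interface, leaves the other tables untouched.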

Because knowledge is separated into problem spaces, it can be easily
updated. For example, if the agent's plane is equipped with a new
radar with a longer range, only the knowledge in the "radar" space
need be updated. New decisions made in the radar space will interact
with the results of reasoning in other problem spaces, eventually
impacting high-level decisions such as which specific actions should
be taken to intercept a bogey. Likewise, if the automated agent is
moved to a new simulation environment with a new interface, we can
appropriately update the knowledge in the "control" problem space,
leaving the remaining knowledge intact.

Simple  tactical  situations 

Our initial effort to construct an intelligent agent focuses on two
tactical scenarios used in training pilots: the "non-jinking bogey"
and "1-v-1 aggressive bogey" scenarios. In the non-jinking bogey
scenario, the target is an airplane (such as a cargo or fuel plane)
that holds a steady course and altitude, and does not carry any
offensive threats. The key to this scenario is that the bogey does not
attempt to evade (jink) the fighter's attack in any way. Although this
situation is not likely to occur often in real combat situations, it
is a valuable training situation for pilots. It teaches them how to
line up the delivery of various types of missiles when the bogey's
behavior is very predictable. When a non-offensive bogey's behavior
becomes less predictable, the tactics required to intercept it
actually become simpler (but less effective).

There are three main phases involved in attacking a non-jinking bogey
(see Figure 1). These involve delivering long, medium, and short-range
missiles. During each of the phases, the fighter must assume that the
current missile will miss, and simultaneously maneuver into the most
advantageous position for the next phase. For example, while moving
closer to the bogey to fire a long-range missile, the fighter also
attempts to achieve the best lateral separation and target aspect for
a shot with the medium-range missile (see Figure 2). After delivering
a medium-range missile, the fighter must perform displacement and
counter turns in order to end up behind the bogey. This allows the
fighter to fire a rear-quarter short-range missile. Due to these
constraints, the fighter cannot simply head on a collision course with
the bogey, but must get to the bogey as quickly as possible while
ensuring that it can eventually achieve a rear-quarter missile shot.
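
The idea that each phase carries two concurrent goals (deliver the
current missile, and set up for the next one) can be sketched as
follows. This is a simplified illustration under our own assumptions:
the phase names mirror the text, but the function names and the goal
strings are hypothetical and are not the agent's actual representation.

```python
PHASES = ["long-range", "medium-range", "short-range"]


def next_phase(phase):
    """Phase that follows the given one, or None after the last phase."""
    index = PHASES.index(phase)
    return PHASES[index + 1] if index + 1 < len(PHASES) else None


def goals_for(phase):
    """Goals pursued together in a phase: fire now, position for later."""
    goals = [f"fire {phase} missile"]
    upcoming = next_phase(phase)
    if upcoming == "medium-range":
        # Assume the long-range missile will miss and set up the next shot.
        goals.append("achieve lateral separation and target aspect")
    elif upcoming == "short-range":
        # Assume the medium-range missile will miss and get behind the bogey.
        goals.append("perform displacement and counter turns")
    return goals


for phase in PHASES:
    print(phase, "->", goals_for(phase))
```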

The tactics for executing this scenario are relatively simple. The
fighter must achieve the appropriate lateral separation and target
aspect while firing its weapons at the right times. Then it must
execute the displacement and counter turns and deliver the short-range
missile. As mentioned previously, we could code these tactics directly
into rules for the agent, but they would then only work under very
specific circumstances where everything goes right. Thus, we have
implemented the knowledge that supports these tactics. This knowledge
justifies why each tactical decision should be made when it is made.
This allows the system, for example, to get back on course for a
short-range missile shot if it misses its opportunity for the
medium-range missile shot for some reason. In addition, any particular
action that the agent generates will be based on the supporting
knowledge, and the agent has the potential to explain its decision (a
facility we plan to add in the future).

Figure 1. Three stages for intercepting a non-jinking bogey.

The 1-v-1 aggressive bogey scenario involves two airplanes with
similar capabilities. One is protecting a high-value unit and the
other is attempting to destroy it. When the two fighters come in
contact they both attempt to intercept and destroy each other, with
the overall goal of surviving. This scenario highlights an interaction
between different low-level constraints that results in tactical
decisions. For example, if one fighter is equipped with a slightly
better radar, missiles with longer range, or a more mobile airplane,
it dramatically affects the actions that should be taken in completing
the mission and surviving. Our agent so far only partially implements
this 1-v-1 scenario, and it involves a number of issues that make it
more complex than the non-jinking bogey scenario. After discussing the
current state of the agent model, we will describe these issues in
detail.


Details  of  the  intelligent  agent 

In order to construct an agent that successfully intercepts a
non-jinking bogey, we analyzed tactics for the scenario and
interviewed former pilots and radar intercept officers. This allowed
us to determine the underlying knowledge and first principles that
support the tactics. Then, we encoded this knowledge into an
executable Soar system.

The Soar agent's knowledge is organized into problem spaces, each
containing operators that allow the agent to reason about particular
types of goals. When the agent cannot immediately carry out an action
at one level, it uses Soar's universal subgoaling mechanism to move
into an alternate problem space and consider methods for carrying out
that action. Therefore, high-level tactical decisions are eventually
implemented as medium-level maneuver actions or low-level control
actions, and the agent always has multiple goals in memory that it
uses to reason about and react to its ever-changing situation.

Depending on the particular simulation platform, the current Soar
agent requires between 13 and 17 problem spaces to reason with; i.e.,
13-17 different types of goals that it reasons about. Most of these
are shown in Figure 3. The mission, protect-hvu, barcap, and intercept
problem spaces encode tactical knowledge for carrying out missions and
performing intercepts. The problem spaces for weapons and missiles
include knowledge about specific weapons and the actions that must be
performed to deliver them to a target. The maneuver and absolutes
problem spaces determine the actual plane maneuvers that must be
carried out to implement higher-level actions. The remaining problem
spaces implement airplane maneuvers at various levels of
specification, down to the level of stick and button commands that are
issued to the flight simulator.

At any particular instant, between 5 and 12 problem spaces (or
hierarchical goals) are usually active. Thus, when changes occur in
the agent's situation, there are multiple levels at which the agent
may react (Pearson et al., 1993). For example, at a low level, a
sudden down draft can cause a change in climb-rate or altitude,
leading the agent directly to pull back on the stick. At a higher
level, a maneuver by a bogey on the radar can cause a change in
tactics. Any reasoning involved in implementing the new tactical
decision also percolates down to a new maneuver or stick action. In
this manner, Soar maintains its variety of goals in parallel, and
violations of the goals at any level lead to immediate action at the
appropriate level.
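
A rough Python sketch of reacting at the level where a goal is
violated is given below. It is an illustration only, not the Soar
implementation: the percept names (bogey_turning, climb_rate), the
numeric threshold, and the react function are all hypothetical.

```python
def react(active_goals, percepts):
    """Return (level, action) for every active goal that is violated."""
    reactions = []
    for level, is_violated, action in active_goals:
        # Every active goal is checked on each cycle, not just the lowest.
        if is_violated(percepts):
            reactions.append((level, action))
    return reactions


# Hypothetical stack of simultaneously active goals, highest level first.
ACTIVE_GOALS = [
    ("tactics", lambda p: p["bogey_turning"], "reconsider intercept tactic"),
    ("control", lambda p: p["climb_rate"] < -5.0, "pull back on the stick"),
]

# A sudden down draft violates only the low-level goal.
print(react(ACTIVE_GOALS, {"bogey_turning": False, "climb_rate": -10.0}))
# A bogey maneuver violates only the tactical goal.
print(react(ACTIVE_GOALS, {"bogey_turning": True, "climb_rate": 0.0}))
```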

We have implemented an initial model for the non-jinking bogey
scenario in whole or in part on three separate flight simulators. The
simplest simulator moves planes in a two-dimensional grid-world. In
addition, the planes do not move with realistic flight dynamics. We
used this simulator to prototype the system and debug the high-level
tactics embedded in the system. The second flight simulator was
adapted from the flight simulator provided with SGI graphics
workstations. It works in real time and requires the agent to issue
very low-level commands at the level of moving the stick (by issuing
mouse pixel movements) and other low-level commands (by simulating
keyboard presses). The non-jinking bogey scenario has not yet been
completely implemented on this simulator, because Soar must handle the
low-level intricacies of simply flying the airplane as well as
worrying about tactical decisions and maneuvering. Finally, we have
implemented the scenario on BBN's ModSAF simulator, which has the most
realistic flight dynamics of the three simulators. This simulator
works in real time (with a scheduler dividing time between the
simulation and agents) and it takes commands at the level of maneuver
actions (such as desired heading and altitude) without making the
agent concern itself with how the maneuvers are actually implemented
with airplane controls.

As of now, we have not completely developed the knowledge base that
would allow our agent to successfully fly the 1-v-1 aggressive bogey
scenario. This scenario differs from the non-jinking bogey scenario
along two major dimensions. First, the bogey maneuvers, so its
behavior is not entirely predictable. Second, the bogey is aggressive
and has offensive capabilities, so any action that is taken must also
address the overall goal of surviving: the agent cannot simply close
in on the bogey and shoot it.

In order to successfully complete a mission against an aggressive
bogey, the agent must not only include extra knowledge in its tactical
problem spaces, but must also have two new capabilities to address the
above issues. First, the agent must be able to interpret and assess
its current situation at all (or at least most) times. This primarily
involves interpreting the bogey's current actions and predicting its
future actions. As with most of the agent's reasoning, the
interpretation process also takes place at multiple levels. At a low
level, the fighter must recognize when the bogey has initiated a turn
and when it has completed one. At a higher level, the fighter must
determine whether the turn indicates some kind of threat, and what
that threat may be. For example, if the bogey initially comes to a
collision course with the fighter, this probably indicates that the
bogey is aggressive and is going to try to shoot the fighter. If the
bogey points towards the fighter and then makes a hard turn, this
indicates that the bogey has probably just fired a missile. The agent
must interpret the limited information it gets from its sensors. Then
it must use this interpretation to predict the goals that the bogey is
trying to achieve and the actions at different levels that the bogey
is carrying out.
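
The two levels of interpretation can be sketched in Python as follows.
This is a minimal illustration under stated assumptions, not the
agent's actual interpretation knowledge: we assume the sensors yield a
sequence of bogey headings in degrees, and the threshold value and the
names turn_events and interpret are hypothetical.

```python
def turn_events(headings, threshold_deg=2.0):
    """Low level: mark where the bogey initiates and completes a turn.

    headings: successive bogey headings in degrees (assumed sensor data).
    Returns a list of (sample_index, event_name) pairs.
    """
    events = []
    turning = False
    for i in range(1, len(headings)):
        # Smallest signed heading change, handling wrap-around at 360.
        delta = (headings[i] - headings[i - 1] + 180.0) % 360.0 - 180.0
        if abs(delta) >= threshold_deg and not turning:
            turning = True
            events.append((i, "turn-initiated"))
        elif abs(delta) < threshold_deg and turning:
            turning = False
            events.append((i, "turn-completed"))
    return events


def interpret(pointed_at_fighter, made_hard_turn):
    """Higher level: map observed behavior to a probable threat."""
    if pointed_at_fighter and made_hard_turn:
        return "bogey has probably just fired a missile"
    if pointed_at_fighter:
        return "bogey is probably aggressive"
    return "no threat inferred"


print(turn_events([90.0, 90.0, 95.0, 100.0, 105.0, 105.0, 105.0]))
print(interpret(pointed_at_fighter=True, made_hard_turn=True))
```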

The second necessary capability for the agent is to use multiple
high-level goals to constrain the actions that the agent generates.
These types of goals are a bit different from the parallel goals that
the Soar agent already handles, because they are not hierarchical in
nature. Rather, they are distinct goals that interact with each other.
For example, one goal, destroy bogey, implies that the fighter should
close in on the bogey as quickly as possible. However, another goal,
survive, pressures the fighter to avoid the bogey in order to stay out
of the bogey's weapon range. These conflicting goals both must be used
to select from multiple possible actions. This type of reasoning leads
directly to composite tactical actions. For example, the fighter may
get close enough to fire a missile and then make a sudden hard turn.
The turn must be hard enough to keep the bogey and fighter from
getting close too quickly, but not so hard that the fighter loses its
radar lock on the bogey (which would put the fighter at a large
disadvantage). In this manner, the agent determines the best action
that supports two simultaneous, conflicting goals.
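
One simple way to picture two goals jointly constraining an action is
to treat each goal as a bound on the candidate turns. The sketch below
does that; it is our own simplification, not the agent's reasoning.
Measuring turn hardness in g, the numeric bounds, and the tie-breaking
choice of the gentlest acceptable turn are all assumptions.

```python
def choose_turn(candidate_turns_g, min_g_for_survival, max_g_for_radar_lock):
    """Pick a turn hardness acceptable to both conflicting goals.

    min_g_for_survival: softest turn that still keeps the fighter from
        closing on the bogey too quickly (from the survive goal).
    max_g_for_radar_lock: hardest turn that still keeps radar lock
        (from the destroy-bogey goal).
    Returns None when no candidate satisfies both goals.
    """
    feasible = [
        g for g in candidate_turns_g
        if min_g_for_survival <= g <= max_g_for_radar_lock
    ]
    if not feasible:
        return None
    # Assumed tie-break: prefer the gentlest turn that meets both goals.
    return min(feasible)


print(choose_turn([2, 3, 4, 5, 6], min_g_for_survival=3.5, max_g_for_radar_lock=5.5))
```

When the feasible set is empty, the goals cannot both be met by any
candidate, and some further reasoning about which goal to relax would
be needed.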


The issues of interpretation and simultaneous goals are not trivial,
and they play central roles in agent reasoning for any tactical
situations except the simplest ones. Much of tactical decision making
involves creating a model of the world from limited information and
addressing multiple goals and constraints, such as the current
mission, survival, and the characteristics and status of the weapons
and airplane. We have not completed the incorporation of this
knowledge into the agent yet, but we are taking advantage of the
strengths of the Soar architecture in order to implement these two
important capabilities (Covrigaru, 1992).

Discussion 

We have implemented an intelligent, autonomous agent that completes
missions in a simple tactical scenario. The agent is designed with
flexibility in mind. It reasons from first principles about high-level
tactical decisions, and is thus able to reason in unexpected
situations and recover gracefully from mistakes. In addition, the
agent's knowledge base is flexible enough to be easily transferred
between simulation platforms and to encode new tactics in a modular
fashion. We are currently implementing the knowledge necessary for the
agent to complete the 1-v-1 aggressive bogey scenario. This includes
addressing the two important issues of situation interpretation and
achieving multiple simultaneous and interacting goals.

Our future research will involve incrementally expanding the agent's
knowledge base so it can reason robustly in a wide range of 1-v-1
scenarios. We will also soon focus on modeling more complex scenarios,
including those involving more than two planes. This will also allow
us to expand the agent's coverage of the cognitive behaviors involved
in tactical flight. For example, we will incorporate more intelligent
methods for situation assessment, modeling other agents (i.e.,
robustly predicting actions and goals of other participants in the
scenario, both friends and foes), identifying potential threats, and
reacting to them. Beyond that, we will focus on more complex cognitive
tasks, such as more complete integration of planning, reaction, and
execution, more sophisticated interpretation of the environment and
other agents, and learning from instruction.
Abstract

This article reports on recent progress
in the development of TacAir-Soar, an
intelligent automated agent for tactical air
simulation. This includes progress in
expanding the agent's coverage of the
tactical air domain, progress in enhancing
the quality of the agent's behavior, and
progress in building an infrastructure for
research and development in this area.

Introduction 

At  the  Third  Conference  on  Computer 
Generated  Forces  and  Behavioral 
Representation  we  presented  an  initial  report 
on  an  effort  to  build  intelligent  automated 
agents for tactical air simulation (Jones et al.,
1993).  The  ultimate  intent  behind  this  effort 
is  to  develop  automated  pilots  whose 
behavior  in  simulated  battlefields  is  nearly 
indistinguishable  from  that  of  human  pilots 
(and  to  go  beyond  this  to  develop  generic 
agents  that  are  readily  specializable  for  this 
and  other  domains).  If  such  agents  can  be 
created,  they  should  provide  close  to  ideal 
force  supplements  for  many  of  the 
applications  anticipated  for  distributed 
interactive  battlefield  simulation. 

As  of  the  initial  report,  prototype  agents 
had been constructed that could, in real-time,
flexibly use a small amount of tactical
knowledge about two classes of one-versus-
one (1-v-1) Beyond Visual Range (BVR)
tactical  air  scenarios.  In  the  non-jinking 
bogey  scenarios,  one  plane  (the  non-jinking 
bogey)  is  unarmed  and  maintains  a  straight- 
and-level  flight  path.  The  other  plane  is 
armed  with  long-range  radar-guided, 
medium-range  radar-guided,  and  short-range 
infrared-guided  missiles.  Its  task  is  to  set  up 
for  a  sequence  of  missile  shots,  at 
increasingly shorter ranges, until the non-
jinking bogey is destroyed. Though such
scenarios are not common in the real world,
they  are  used  as  training  exercises  because 
they  teach  pilots  how  to  position  their  planes 
for  later  shots  while  simultaneously  taking 
earlier  ones.  In  the  aggressive  bogey 
scenarios,  one  plane  is  attempting  to  protect 
a  High-Value  Unit  (HVU),  such  as  an 
aircraft  carrier,  via  a  Barrier  Combat  Air 
Patrol  (BARCAP);  that  is,  the  plane  patrols 
between  the  HVU  and  the  anticipated  threat 
(by  cycling  around  a  racetrack  pattern),  and 
then  intercepts  any  threat  that  it  detects  in  its 
sector. The other plane is attempting to
attack  the  HVU,  but  to  do  so  it  must  first 
intercept  the  defensive  aircraft. 

The  prototype  agents  were  all 
implemented  as  parameterized  variations  of 
a single multi-functional tactical-air agent,
called TacAir-Soar (or TAS for short). TAS
is  built  within  Soar,  a  software  architecture 
that  is  being  developed  as  a  basis  for  both  an 
integrated  intelligent  system  and  a  unified 
theory of human cognition (Rosenbloom,
Laird,  &  Newell,  1993;  Newell,  1990).  Soar 
provides  TAS  with  basic  support  for 
knowledge  representation,  problem  solving, 
reactivity,  external  interaction,  and  learning. 
Soar  also  provides  a  potential  means  of 
integrating  into  TAS  additional  planning, 
learning,  and  natural  language  capabilities 
that  are  being  developed  independently 
within  Soar. 

The  prototype  TAS  agents  actually 
utilized  only  a  subset  of  the  capabilities 
provided  either  directly  by  Soar,  or  built 
separately within it. However, this subset -
along  with  the  domain-specific  (and 
domain-independent)  rules  that  were  added 
to  Soar’s  long-term  memory  —  was 
sufficient  to  yield  a  combination  of 
knowledge-based  decision  making, 
task(/goal)  switching  and  decomposition, 
and  real-time  interaction  with  the  DIS 
environment. Knowledge-based decision-
making arises from Soar’s ability to make
decisions  based  on  integrating  preferences 
generated  by  arbitrary  sets  of  rules.  Task 
switching  also  arises  from  Soar’s  decision¬ 
making  abilities,  but  here  as  specifically 
applied  to  the  selection  and  switching  of 
tasks.  Tasks(/goals)  are  represented  as 
operators  in  Soar,  and  are  one  of  the  main 
foci  of  its  decision  making.  Task 
decomposition  arises  from  using  the  same 
decision  mechanism  to  drive  task 
performance,  plus  Soar’s  ability  to 
automatically  generate  a  new  performance 
context  when  a  decision  is  problematic. 
When  these  mechanisms  are  combined  with 
rules  that  generate  preferences  about  which 
subtasks  are  appropriate  for  which 
problematic  parent  tasks  (in  the  particular 
situation  of  interest),  task  decomposition 
occurs.  Real-time  interaction  with  the  DIS 
environment  arises  from  the  combination  of 
Soar’s  incorporation  of  perception  and 
action  within  the  inner  loop  of  its  decision 
making  capabilities  -  thus  allowing  all 


decisions  to  be  informed  by  the  current 
situation  (and  interpretations  of  it,  as 
generated  by  rule  firings)  -  and  the  use  of 
ModSAF  (Calder  et  al,  1993)  as  the 
interface  to  the  DIS  environment  (Schwamb, 
Koss,  &  Keirsey,  1994). 
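
As a rough illustration of the decision scheme described above, the following Python sketch integrates preferences produced by independent rules and reports an impasse when they do not single out one operator. It is a deliberate simplification and not Soar's actual preference semantics: the preference kinds, the `Preference` class, the `decide` function, and the operator names are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Preference:
    """One vote cast by a rule (hypothetical, simplified representation)."""
    operator: str          # task/operator the rule is voting on
    kind: str              # "acceptable", "reject", or "better"
    worse_than: str = ""   # for "better": the operator that this one beats


def decide(preferences):
    """Integrate preferences; return (chosen, None) or (None, candidates) on an impasse."""
    candidates = {p.operator for p in preferences if p.kind == "acceptable"}
    candidates -= {p.operator for p in preferences if p.kind == "reject"}
    # Drop any candidate that a surviving candidate is preferred over.
    dominated = {p.worse_than for p in preferences
                 if p.kind == "better" and p.operator in candidates}
    remaining = sorted(candidates - dominated)
    if len(remaining) == 1:
        return remaining[0], None
    # Zero or several candidates left: the decision is "problematic".
    return None, remaining


prefs = [
    Preference("intercept", "acceptable"),
    Preference("patrol", "acceptable"),
    Preference("intercept", "better", worse_than="patrol"),
]
chosen, impasse = decide(prefs)
```

In Soar, a problematic decision of this kind is what leads to a new performance context in which further rules can propose subtasks; the sketch only returns the unresolved candidates.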

When  combined  with  the  very 
preliminary  domain  knowledge  that  was 
encoded  at  the  time,  this  combination  of 
capabilities  yielded  competent  behavior  for 
the  non-jinking  bogey  scenarios,  but  only 
fragments  of  behavior  for  the  aggressive 
bogey  scenarios  (due  to  insufficient 
knowledge  about  this  class  of  scenarios). 
One  type  of  aircraft,  similar  to  an  F14,  was 
flown  in  these  scenarios. 

The  purpose  of  this  article  is  to  provide 
a report, one year later, on the progress in
moving  TAS  from  the  initial  prototype 
agents  towards  the  ultimate  goal  of  human¬ 
like  automated  pilots  that  are  broadly 
capable  in  tactical  air  scenarios.  This  report 
is  intended  to  be  complemented  by  the  more 
detailed  articles  about  particular  aspects  of 
TAS  that  also  appear  in  these  proceedings 
(Johnson, 1994a; Jones & Laird, 1994;
Jones  et  al,  1994;  Koss  &  Lehman,  1994; 
Laird  &  Jones,  1994;  Rubinoff  &  Lehman, 
1994;  Schwamb,  Koss,  &  Keirsey,  1994; 
Tambe  &  Rosenbloom,  1994;  van  Lent  & 
Wray,  1994),  rather  than  to  substitute  for 
them.  Thus,  where  there  is  a  potential 
overlap  between  this  report  and  any  of  the 
more  detailed  articles,  this  report  will 
become  more  terse  and  defer  (and  refer)  to 
the  appropriate  detailed  article(s). 

In  the  body  of  this  report,  progress  on 
domain  capabilities  will  be  covered  first. 
The  focus  here  is  on  expanding  the  classes 
of  domain  scenarios  in  which  the  agents  can 
behave  appropriately.  Second,  progress  on 
intelligent  capabilities  will  be  covered.  The 
focus  here  is  on  expanding  the  classes  of 
basic  intelligent  abilities  —  such  as  coping 
with  multiple  interacting  tasks,  plan 
recognition,  learning,  planning,  self¬ 
explanation,  and  natural  language  - 
exhibited  by  the  agents.  Third,  progress  on 
infrastructure  capabilities  —  such  as 
integration  with  the  DIS  simulation 
environment, low-cost interfaces for human
pilots,  knowledge  acquisition,  and 
documentation  -  will  be  covered.  Finally, 
the  article  will  be  concluded  with  plans  for 
the  future. 

Domain  Capabilities 

Progress  on  domain  capabilities  has 
occurred  in  two  general  areas:  (1)  improving 
the  robustness  and  range  of  the  scenarios 
within 1-v-1 BVR tactical air; and (2)
scaling up the scenarios in terms of the
number  of  vehicles,  the  range  of  vehicle 
types,  and  the  complexity  of  the  required 
organization  and  communication  among  the 
vehicles. 

Within 1-v-1, the TAS agents can now
exhibit  competent  behavior  in  the  BVR 
tactical-air  segments  of  the  aggressive  bogey 
scenarios.  This  includes  the  ability  to  patrol 
in  a  racetrack  pattern;  select  radar  modes, 
detect  opponents  on  radar,  perform  search 
and  acquire  activities  when  opponents  drop 
off  of  radar,  and  maneuver  so  as  to  confuse 
the  opponent’s  search  and  acquire  activities; 
determine  and  attempt  to  achieve 
appropriate  intercept  geometries  and  launch- 
acceptability  regions  (LARs);  select,  fire, 
and  support  missiles;  and  detect  and  evade 
enemy  missiles. 

As  played  out  in  the  DIS  environment,  a 
typical  aggressive  bogey  scenario  involves 
an  F14  which  is  defending  its  aircraft  carrier 
against  possible  attack  by  a  MiG29.  The 
F14  patrols  in  a  racetrack  until  it  spots  the 
MiG29  (the  F14’s  radar  and  missiles  both 
have  longer  ranges  than  do  the  MiG29’s). 
The  F14  continues  to  monitor  the  MiG29 
until  its  commit  criteria  are  achieved,  at 
which  point  it  begins  the  intercept  by 
attempting  to  achieve  a  good  geometry  from 
which  to  fire  a  long-range  missile  (LRM). 
At  some  later  point  the  MiG29  detects  the 
F14  and  also  then  begins  an  intercept.  This 
makes  it  difficult  for  the  F14  to  achieve  any 
further  advantage  in  intercept  geometry,  so 
it  gives  up  on  that,  and  turns  to  maximize 
the  rate  of  closure  (and  thus  to  minimize  the 
time  before  the  intercept  is  complete). 


When  the  F14  is  finally  close  enough 
(that  is,  it  has  the  MiG29  within  its  LRM’s 
launch-acceptability  region),  and  is  oriented 
correctly,  it  launches  a  long-range  missile, 
and  performs  an  fpole  (a  turn  that  decreases 
the  rate  of  closure  between  the  aircraft  —  to 
delay  the  arrival  of  any  missiles  that  might 
have  been  launched  from  the  MiG29  - 
while  simultaneously  keeping  the  MiG29  on 
the  F14’s  radar).  The  MiG29  detects  the 
fpole,  and  beams  in  response,  by  turning 
perpendicular  to  the  F14  (to  render  blind  the 
Doppler  radar  that  is  guiding  the  F14’s 
missile).  The  F14  then  attempts  to  search 
for  and  reacquire  the  MiG29,  while 
simultaneously  changing  altitude  in  order  to 
confuse  the  MiG29’s  search  and  acquire 
activities. 
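
The geometry behind the fpole and the beam can be sketched numerically. The Python example below is only an illustration under stated assumptions: flat two-dimensional geometry, equal speeds of 250 m/s, a 50 km separation, and a 50-degree fpole offset are all invented for the example, and the function names are hypothetical. It shows that turning away from the line of sight reduces the rate of closure, and that a target flying perpendicular to the line of sight has almost no velocity component toward the radar.

```python
import math


def velocity(speed, heading_deg):
    """Velocity (east, north) for a compass heading in degrees."""
    h = math.radians(heading_deg)
    return (speed * math.sin(h), speed * math.cos(h))


def closure_rate(pos_a, vel_a, pos_b, vel_b):
    """Rate at which the range between a and b shrinks (positive means closing)."""
    dx, dy = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    rng = math.hypot(dx, dy)
    ux, uy = dx / rng, dy / rng                      # unit line of sight, a to b
    rvx, rvy = vel_b[0] - vel_a[0], vel_b[1] - vel_a[1]
    return -(rvx * ux + rvy * uy)


def radial_speed(pos_radar, pos_target, vel_target):
    """Component of the target's velocity along the line of sight toward the radar."""
    dx, dy = pos_radar[0] - pos_target[0], pos_radar[1] - pos_target[1]
    rng = math.hypot(dx, dy)
    return (vel_target[0] * dx + vel_target[1] * dy) / rng


f14_pos, mig_pos = (0.0, 0.0), (0.0, 50_000.0)       # MiG29 due north of the F14

# Both aircraft pointed at each other.
head_on = closure_rate(f14_pos, velocity(250, 0), mig_pos, velocity(250, 180))
# F14 turns 50 degrees off the line of sight (assumed fpole offset).
fpole = closure_rate(f14_pos, velocity(250, 50), mig_pos, velocity(250, 180))
# MiG29 turns perpendicular to the line of sight (beam).
beam = radial_speed(f14_pos, mig_pos, velocity(250, 90))
```

Whether the opponent stays on radar during the fpole depends on the radar's scan limits, which this sketch does not model.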

Both  planes  then  generally  attempt  to 
set  up  for  further  missile  launches,  and  to 
avoid  missiles  launched  by  their  opponents. 
Depending  on  the  exact  timing  of  the 
engagement,  and  on  the  willingness  of  the 
two  planes  to  take  risks  (this  is  a  TAS 
parameter),  zero,  one,  or  both  of  the  planes 
may be shot down in the process.¹

This  scenario  can  be  played  out  with 
both  planes  flown  by  TAS  agents,  or  with 
one  or  the  other  flown  by  a  human  pilot  in  a 
flight  simulator.  A  formal  demonstration  of 
the  aggressive  bogey  scenario  in  the 
WISSARD  laboratory  at  Oceana  Naval  Air 
Station  during  June  ’93  successfully  pitted 
two  TAS  agents  in  simulated  F14s  against 
two human pilots (in F18 simulators, but
acting as MiG29s). This demonstration was
set up as two independent 1-v-1
engagements  (out  of  radar  range  of  each 
other).  Given  the  early  state  of  development 
of  the  agents  at  the  time,  the  human  pilots 
were  constrained  in  terms  of  the  kinds  of 


¹In real engagements, if one or more of the aircraft
survive  the  BVR  segment  of  the  scenario,  either  a 
within-visual  range  (WVR)  engagement  -  that  is,  a 
dogfight  -  or  an  air-to-ground  attack  on  the  HVU 
may then occur. However, these aspects of the
scenario  are  not  part  of  BVR  tactical  air,  and  are  thus 
not  pursued  by  the  TAS  agents. 
tactics they were allowed to use. Under
these  circumstances  the  demonstration 
proceeded  successfully,  in  real-time,  and  in 
an  otherwise  unscripted  manner.  The 
resulting  behavior  was  much  as  described  in 
the  typical  example  above.  Feedback  from 
Navy  personnel  in  attendance  at  the 
demonstration  was  uniformly  positive. 

Despite  this  demonstrable  success  - 
and  the  fact  that  in  numerous  subsequent 
presentations  to  domain  experts  and  other 
Navy  personnel  TAS  has  consistently 
impressed  with  its  quality  of  behavior  -  it 
must  be  noted  that  TAS  is  still  not  close  to 
covering  the  full  complexity  of  the  domain 
abilities  described  above,  or  the  interactions 
among  them.  For  example,  only  a  subset  of 
the  radar  modes  are  used;  search  and  acquire 
in  three  dimensions  is  not  strong;  and  only  a 
subset  of  the  possible  tactics  for  patrolling, 
confusing,  intercepting,  and  evading  are 
used.  Fleshing  out  these  abilities  does  not 
look  conceptually  difficult  at  this  point,  just 
time  consuming. 

Another  dimension  of  complexity  in  1- 
v-1  BVR  tactical  air  that  is  not  fully 
addressed  at  this  point  by  TAS  is  the  space 
of  possible  missions  that  the  agents  need  to 
be  able  to  perform.  The  aggressive  bogey 
scenarios  cover  two  types  of  missions 
(BARCAP-HVU  and  ATTACK-HVU); 
however,  there  is  still  a  handful  of  others. 
One  other  mission  to  which  TAS  has 
recently  been  extended  is  a  MiGSWEEP.  A 
MiGSWEEP  is  a  sweep  by  one  side’s 
fighters  through  the  other  side’s  territory  to 
clear  out  a  corridor  for  later  aircraft  (such  as 
bombers).  In  addition  to  the  abilities 
required  for  the  previous  missions,  a 
MiGSWEEP  requires  the  ability  to  fly  to 
waypoints,  and  to  break  off  an  intercept  and 
"blow  through"  an  opponent  (that  is,  engage 
in  a  small  amount  of  WVR  behavior  in  order 
to accomplish a high-speed pass of an
opponent and continue with the planned
flight  path). 

In scaling up from these 1-v-1
scenarios,  the  biggest  change  has  been  the 
incorporation  of  an  ability  to  detect  and 
manage multiple aircraft. In 2-v-1


engagements  a  section  (i.e.,  a  coordinated 
pair  of  planes)  must  be  able  to  fly  together 
in  formation  and  execute  coordinated 
tactics.  In  service  of  this  they  must  be  able 
to  communicate  with  each  other,  and  to  be 
aware  of  each  other’s  positions.  The  TAS 
agent  is  now  capable  of  doing  this  (as 
discussed  in  the  next  section)  to  support 
competent 2-v-1 behavior, within the same
kinds of limits described for 1-v-1 (Jones &
Laird,  1994;  Laird  &  Jones,  1994). 

In 1-v-2 engagements a single aircraft
must  be  able  to  identify  and  sort  out  the 
activities  of  a  pair  of  adversaries  who  may 
or  may  not  be  flying  together  as  a  section 
(Jones  &  Laird,  1994).  It  must  be  able  to 
work  out  intercept  geometries  that  take  both 
opponents  into  account  -  so  as,  for 
example,  not  to  be  sandwiched  between 
them.  It  must  also  be  able  to  determine 
which  of  the  pair  is  the  primary  threat,  target 
the  primary  threat,  and  determine  when  to 
also  fire  at  the  secondary  threat.  For 
example,  if  the  pair  are  flying  in  a 
coordinated  fashion,  then  firing  a  missile  at 
one  is  likely  to  cause  both  to  beam.  It  would 
thus  be  a  waste  to  launch  missiles  at  both 
under  such  circumstances.  The  current  TAS 
agents  are  also  capable  of  performing 
competently in such 1-v-2 engagements.

In 2-v-2 engagements, many of the
same issues come up as in 1-v-2 and 2-v-1.
However,  additional  capabilities  are 
required  to  sort  the  opponents  (determining 
which  friendly  aircraft  has  the  responsibility 
for  which  opponent  aircraft),  to  decide  when 
one  or  both  aircraft  should  launch  missiles, 
and  to  decide  when  to  split  the  single  2-v-2 
engagement  into  two  independent  1-v-l 
engagements  (i.e..,  to  strip).  Though  work 
on  2-V-2  has  just  recently  begun,  there  is 
now  at  least  one  working  example  of  a 
section  of  TAS  agents  successfully  sorting 
and  firing  at  another  section  of  TAS  agents. 
Once 2-v-2 is completed, larger
engagements  (2-v-N,  4-v-4,  N-v-N,  etc.) 
will  still  remain  to  be  covered. 

Other  aspects  of  scale  up  that  are 
currently  in  progress  include  adding  the 
ability to fly an F18 (to the original F14 and
the recently added MiG29), and the addition
of  an  air  intercept  control  (AIC)  agent  in  an 
E2  (a  specialized  radar  plane  that  is  similar 
to an AWACS) (Rubinoff & Lehman, 1994).
The  AIC’s  job  is  different  in  a  number  of 
ways  from  that  of  a  fighter  pilot,  so 
stretching  TAS  to  accommodate  this  new 
type  of  agent  should  force  further 
generalization  of  its  capabilities. 

Intelligent  Capabilities 

With  respect  to  intelligent  capabilities, 
the  most  significant  advance  over  the 
prototype  agents  has  been  the  addition  to 
TAS  of  the  ability  to  maintain  episodic 
memories  of  its  engagements,  and  to  use 
these  memories  in  reconstructing  what  it 
did,  why  it  did  what  it  did,  and  what  else  it 
would  have  done  if  the  situation  had  been 
slightly  different  (Johnson,  1994a;  Johnson, 
1994b).  These  description  and  explanation 
capabilities  are  available  through  an 
interactive  debriefing  interface,  in  which 
questions  can  be  asked  via  selection  from 
dynamically  created  menus,  and  answers  are 
generated  in  (approximate)  English.  In 
contrast  to  explanation  in  most  expert 
systems,  where  there  is  a  distinct 
"explanation"  system  that  has  direct  access 
to  the  "performance"  system’s  knowledge 
and  derivational  traces,  TAS  generates  the 
explanations itself based only on (1) what it
can  remember  about  what  happened  and  (2) 
what  it  can  later  reconstruct  about  what  it 
might  have  done  (and  why  it  might  have 
done  it).  This  is  a  process  that  can  be  misled 
by  circumstances,  but  it  is  expected  to  be 
more  like  how  human  pilots  would  actually 
describe  and  explain  their  own  behavior 
during post-mission debriefing (though the
psychological  and  behavioral  accuracy  of 
this  has  not  yet  been  studied). 
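
One way to picture explanation by reconstruction is to replay a remembered situation through the agent's decision knowledge with one factor changed at a time, and to report the factors whose change would have altered the decision. The Python sketch below is only an illustration of that idea, not the mechanism used in TAS: `choose_action`, the factor names, and `critical_factors` are all hypothetical.

```python
def choose_action(situation):
    """Hypothetical stand-in for the agent's decision knowledge."""
    if situation["missile_incoming"]:
        return "evade"
    if situation["target_in_lar"] and situation["radar_contact"]:
        return "fire-missile"
    return "intercept"


def critical_factors(decide, remembered_situation):
    """Return the remembered decision and the factors that were critical to it.

    A boolean factor is treated as critical if flipping it alone would have
    led to a different action; the alternative action is recorded with it.
    """
    actual = decide(remembered_situation)
    critical = {}
    for name, value in remembered_situation.items():
        variant = dict(remembered_situation)
        variant[name] = not value
        alternative = decide(variant)
        if alternative != actual:
            critical[name] = alternative
    return actual, critical


remembered = {
    "missile_incoming": False,
    "target_in_lar": True,
    "radar_contact": True,
    "wingman_present": True,   # remembered, but irrelevant to this decision
}
action, why = critical_factors(choose_action, remembered)
```

Here `why` maps each critical factor to what the agent would have done had that factor been different, which is the kind of information a debriefing answer could be built from.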

In  addition  to  these  debriefing 
capabilities,  significant  progress  has  also 
been made on incorporating several other
capabilities  into  TAS.  One  capability  is 
coping  with  multiple  interacting  goals. 
Though mapping a forest of interacting
goals  onto  the  single  goal  stack  maintained 


by  Soar  has  turned  out  to  be  non-trivial  - 
and  is  currently  a  topic  of  intensive 
investigation  -  workable  strategies  have 
been  found  for  TAS  agents  to  coordinate 
their  behavior  in  the  presence  of  all  of  these 
goals  and  their  interactions  (Jones  et  al, 
1994).  A  second  capability  is  integrating 
information  from  multiple  sources  about 
multiple  agents  (Jones  &  Laird,  1994).  The 
sources  of  information  about  other  agents 
have  been  expanded  from  just  radar,  to  also 
include radio and vision;² and the number of
agents  about  which  information  can  be 
represented  has  been  expanded  from  one  up 
to  an  arbitrary  number.  A  third  capability  is 
communication  and  coordination  among 
multiple  agents  (Laird  &  Jones,  1994). 
Instead  of  modeling  a  group  of  related 
agents  -  such  as  a  section  of  aircraft  or  a 
platoon  of  tanks  -  as  a  single  aggregate 
unit,  the  behavior  of  groups  is  being 
modeled  at  the  individual  platform  level. 
This  provides  additional  flexibility  and 
realism in the simulation, but also
necessitates  modeling  how  the  groups 
actually  do  communicate  and  coordinate 
among  themselves. 

Additional  capability  investigations  are 
also  underway  in  the  areas  of  learning, 
planning,  plan  recognition  and  natural 
language.  Learning  and  planning  are  a 
relatively  common  part  of  Soar’s  repertoire 
of  behaviors  in  general  (Laird  & 
Rosenbloom,  1990);  however,  they  are  not 
yet  a  routine  part  of  TAS’s  behavior. 
Investigations  of  their  use  in  TAS  have 
begun  -  for  example,  the  debriefing 
capability  depends  on  learning  being  active 
within  certain  key  portions  of  the  TAS 
agents  -  but  it  is  too  early  to  comment 
generally  on  their  outcome.  In  contrast,  plan 
recognition  is  now  a  routine  part  of  TAS’s 


²The radar, vision, and radio inputs attempt to
provide  TAS  with  the  information  a  human  would 
extract  from  those  sources.  However,  this 
information  is  provided  symbolically,  and  no  actual 
visual  or  audio  processing  on  the  part  of  the  agent  is 
required. 
behavior, but only of a simple, low-level, ad
hoc  variety.  For  example,  when  an 
opponent  turns,  a  new  (hand-coded)  task 
may  be  selected  to  interpret  whether  the 
opponent  is  performing  an  fpole  (as  part  of  a 
missile  launching  plan)  or  a  beam  (as  part  of 
a  missile  evasion  plan).  General  plan 
recognition  turns  out  to  be  particularly 
difficult  in  the  DIS  environment  because  of 
the  presence  of  partial  information  about 
multiple,  flexible,  interacting  agents. 
However,  a  more  systematic  approach  based 
on  abstract  model  tracing  (Anderson  et  al, 
1990;  Ward,  1991)  in  (multi-agent)  world- 
centered  models  is  being  investigated  in  a 
version  of  TAS,  and  is  showing  some 
promise  (Tambe  &  Rosenbloom,  1994). 
Finally,  an  investigation  is  in  progress  on 
how  to  incorporate  independently 
developed, Soar-based, natural-language
abilities  (Lehman,  Lewis,  &  Newell, 
1991)  into  TAS  (Rubinoff  &  Lehman, 
1994).  In  theory,  two  automated  agents 
could  communicate  without  using  natural 
language;  however,  to  do  so  can  affect  how 
they  are  perceived  by  agents  that  are 
eavesdropping  on  them.  In  the  longer  run, 
natural  language  is  also  a  critical  capability 
if  automated  agents  are  ever  to  interact  in  a 
seamless  way  with  human  agents.  Natural 
language  communication  will  initially  be 
provided  between  a  pair  of  TAS  agents  -  a 
fighter and an AIC (in an E2) - with further
deployment  hopefully  to  follow. 
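
The simple, low-level kind of plan recognition described above (interpreting an opponent's turn as an fpole or a beam) might look roughly like the following Python sketch. It is not the hand-coded task used in TAS: the function names and all three angular thresholds are assumptions chosen only to make the example concrete.

```python
def angle_off(heading_deg, bearing_deg):
    """Smallest absolute angle between a heading and a bearing, in [0, 180] degrees."""
    return abs((heading_deg - bearing_deg + 180) % 360 - 180)


def interpret_turn(opp_heading_deg, bearing_opp_to_self_deg,
                   min_offset_deg=10, radar_half_angle_deg=60,
                   beam_tolerance_deg=15):
    """Guess what an opponent's new heading means (all thresholds are assumed)."""
    off = angle_off(opp_heading_deg, bearing_opp_to_self_deg)
    if abs(off - 90) <= beam_tolerance_deg:
        return "beam"       # roughly perpendicular: consistent with missile evasion
    if off < min_offset_deg:
        return "pointing"   # nose on us: still intercepting
    if off <= radar_half_angle_deg:
        return "fpole"      # offset but able to keep us on radar: consistent with a launch
    return "other"


# Opponent is due north of us, so the bearing from the opponent to us is 180 degrees.
readings = [interpret_turn(h, 180) for h in (180, 130, 90, 0)]
```

A recognizer of this kind sees only the opponent's heading, which is one reason the text above calls the general problem difficult when information is partial.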

Infrastructure  Capabilities 

With  respect  to  infrastructure,  progress 
has  been  made  on  four  topics:  (1)  integration 
of  Soar  with  the  DIS  simulation 
environment;  (2)  provision  of  a  low-cost 
interface  for  human  pilots;  (3)  knowledge 
acquisition  methodology;  and  (4) 
documentation  tools  and  methodology. 
These  topics  are  covered  in  turn  here. 

TAS  agents  are  now  able  to  act  as  full 
participants  within  the  DIS  battlefield 
simulation environment. The key to this has
been  the  use  of  ModSAF  1.0  (Calder  et  al, 
1993)  as  an  intermediary  between  Soar  and 


DIS  (Schwamb,  Koss,  &  Keirsey,  1994). 
ModSAF  already  contains  an  interface  to 
DIS,  so  it  was  only  necessary  to  add  an 
interface  between  Soar  and  ModSAF.  To  do 
this  we  have  implemented  a  cockpit 
abstraction  on  top  of  ModSAF  that  allows 
TAS  to  focus  on  behaving  like  a  pilot,  while 
ModSAF  simulates  vehicles,  sensors,  and 
weapons.  TAS  is  not  utilizing  ModSAF’s 
own  pilot  behaviors  (such  as  Sweep,  CAP, 
and Fly Route), as programmed into its tasks
and  task  frames;  however,  TAS’s  piloting 
task  has  been  simplified  somewhat  by 
providing  it  high-level  flight  control  via  a 
ModSAF  library  function  that  accepts  as 
parameters  a  desired  altitude,  heading,  etc. 
In  addition  to  adding  the  cockpit  abstraction 
(and  getting  Soar  to  use  it),  we  have 
extended  the  implementation  of  Soar  so  as 
to  allow  multiple  independent  Soar  agents 
within  a  single  process.  This  has  allowed 
multiple  TAS  agents  to  be  compiled 
together with ModSAF in a single process,³
and  thus  allowed  communication  between 
the  agents  and  ModSAF  to  be  mediated 
directly  by  calling  library  functions  (rather 
than  through  slower  interprocess 
communication  mechanisms,  such  as 
sockets). 
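
The division of labor described above, in which the pilot sets desired values and the simulation moves the vehicle, can be sketched in Python as follows. This is not the ModSAF interface or the actual cockpit abstraction: `Cockpit`, `StubVehicle`, `FlightState`, and the rate limits are hypothetical names and numbers used only to show the shape of such an interface, with agent and vehicle living in one process and talking through direct calls.

```python
import math
from dataclasses import dataclass


@dataclass
class FlightState:
    altitude_ft: float
    heading_deg: float
    speed_kts: float


def _approach(current, target, max_step):
    """Move current toward target by at most max_step."""
    delta = target - current
    if abs(delta) <= max_step:
        return target
    return current + math.copysign(max_step, delta)


class StubVehicle:
    """Stand-in for the vehicle simulation (done by ModSAF in the real system)."""
    CLIMB_FT_PER_S = 100.0    # assumed rate limits
    TURN_DEG_PER_S = 5.0
    ACCEL_KTS_PER_S = 10.0

    def __init__(self, state):
        self.state = state

    def step(self, desired, dt):
        s = self.state
        s.altitude_ft = _approach(s.altitude_ft, desired.altitude_ft,
                                  self.CLIMB_FT_PER_S * dt)
        s.speed_kts = _approach(s.speed_kts, desired.speed_kts,
                                self.ACCEL_KTS_PER_S * dt)
        # Turn the short way around the compass.
        turn = (desired.heading_deg - s.heading_deg + 180.0) % 360.0 - 180.0
        max_turn = self.TURN_DEG_PER_S * dt
        if abs(turn) <= max_turn:
            s.heading_deg = desired.heading_deg % 360.0
        else:
            s.heading_deg = (s.heading_deg + math.copysign(max_turn, turn)) % 360.0


class Cockpit:
    """Pilot-level view of a vehicle: set desired values, read instruments."""

    def __init__(self, vehicle):
        self._vehicle = vehicle
        s = vehicle.state
        self._desired = FlightState(s.altitude_ft, s.heading_deg, s.speed_kts)

    def set_desired(self, altitude_ft=None, heading_deg=None, speed_kts=None):
        if altitude_ft is not None:
            self._desired.altitude_ft = altitude_ft
        if heading_deg is not None:
            self._desired.heading_deg = heading_deg
        if speed_kts is not None:
            self._desired.speed_kts = speed_kts

    def instruments(self):
        s = self._vehicle.state
        return FlightState(s.altitude_ft, s.heading_deg, s.speed_kts)

    def tick(self, dt):
        self._vehicle.step(self._desired, dt)


cockpit = Cockpit(StubVehicle(FlightState(20000.0, 350.0, 400.0)))
cockpit.set_desired(altitude_ft=21000.0, heading_deg=10.0)
for _ in range(4):
    cockpit.tick(1.0)
now = cockpit.instruments()
```

Because the same interface could be driven by an agent or by a person, a design of this shape is also what makes it plausible to reuse the abstraction for a human-operated instrument panel.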

Given  a  cockpit  abstraction,  it  turned 
out  to  be  relatively  easy  to  reuse  it  in 
support  of  a  low-cost  interface  for  human 
control  of  ModSAF  aircraft.  The  Human 
Instrument  Panel  (HIP)  provides  an  X- 
Windows-based  interface  to  a  vehicle’s 
cockpit  abstraction  (van  Lent  &  Wray, 
1994).  This  enables  a  human  pilot  to 
perceive  graphically-presented  sensor 
information  and  to  control  the  aircraft’s 
flight,  weapons,  and  sensors  at  the  same 
level  at  which  they  are  controlled  by  TAS 
agents.  Easily  being  able  to  control 
ModSAF  vehicles  at  this  level  of  detail,  and 
on  any  workstation,  has  proven  to  be  quite 
useful  in  testing  and  experimenting  with 


³Soar is currently implemented in C - as is
ModSAF  -  without  which  this  integration  would 
have  been  considerably  more  difficult. 
TAS agents. However, the HIP clearly can’t
completely  replace  the  functionality  of 
higher  fidelity  (and  cost)  flight  simulators. 

With  respect  to  knowledge  acquisition, 
the  most  important  development  has  been 
the  opening  of  the  WISSARD  laboratory  at 
Oceana  Naval  Air  Station  (in  Norfolk,  VA). 
The  lab  contains  two  high  fidelity  (dome) 
aircraft  simulators;  two  medium  fidelity 
aircraft  simulators;  plus  workstations  for 
running  ModSAF,  TAS,  and  several 
visualization  and  analysis  tools.  The 
laboratory  has  enabled  us  to  add  to  the 
standard  knowledge  acquisition 
methodologies  the  ability  to  watch,  tape,  and 
log,  engagements  among  human  pilots  (both 
official  "subject  matter  experts",  as  well  as 
operational  pilots),  and  engagements 
between  human  pilots  and  TAS  agents. 

With  respect  to  documentation,  we  have 
developed  substantial  portions  of  a  three 
layer  hypertext  document  that  links  together: 
(1)  knowledge  about  the  domain  (as 
extracted  from  books,  experts,  etc.);  (2)  a 
description  of  the  structure  and  content  of 
TAS;  and  (3)  the  actual  rules  that  comprise 
TAS  (Koss  &  Lehman,  1994).  This 
documentation  has  been  developed  within 
NCSA  Mosaic,  a  distributed,  multi-media, 
hypertext  system.  It  is  expected  to  facilitate 
understanding  and  validation  of  the 
knowledge  and  code  embodied  in  the 
automated  agents. 

Summary  and  Future 

TAS  is  now  capable  of  performing 
competently  in  beyond-visual  range  tactical- 
air  scenarios  containing  up  to  three 
interacting  aircraft.  Moreover,  it  can  do  so 
while  flying  two  types  of  aircraft  in  service 
of  three  types  of  missions.  It  can  also 
participate  in  interactive  post-mission 
debriefings  about  its  engagements. 

These  various  capabilities  arise  from 
combining  knowledge  about  the  tactical  air 
domain  with  a  set  of  "intelligent"  abilities 
embodied  by  TAS  for  knowledge-based 
decision  making,  reactive  real-time 
interaction,  coping  with  multiple  interacting 


goals,  integrating  information  from  multiple 
sources  about  multiple  agents, 
communication  and  coordination,  episodic 
memory,  and  reconstructive  self-description 
and  self-explanation. 

The  basic  TAS  agent  is  coded  within 
Soar via 145 operators, where each operator
corresponds to a task (or goal) at some level
of granularity. In terms of rules, the
implementation involves approximately
1,500. Most of these rules are responsible
for  proposing,  selecting,  and  applying  the 
operators,  but  some  do  perform  other  tasks 
(such  as  encoding  perceptual  input,  and 
elaborating  state  descriptions).  The 
debriefing  capability  adds  another  80 
operators,  amounting  also  to  approximately 
1,500  rules.  So  the  combined  system 
consists  of  225  operators  and  approximately 
3,000 rules. The natural language
capabilities that are currently being added
utilize  an  additional  56  operators,  and 
approximately  900  rules.  Note  that  these 
operator  and  rule  counts  are  all  "before 
learning",  as  learning  can  increase  both  the 
number  of  rules  and  the  number  of 
operators. 
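
The counts above can be tallied with a few lines of Python. The figures of 145 operators and roughly 1,500 rules for the basic agent are the readings consistent with the stated combined totals of 225 operators and approximately 3,000 rules. The total that includes natural language is not stated in the text; it is simply the sum, under the assumption that the natural language operators and rules do not overlap with the others.

```python
# (operators, approximate rules) per component, "before learning"
components = {
    "basic agent": (145, 1500),
    "debriefing": (80, 1500),
    "natural language": (56, 900),
}


def tally(names):
    """Sum operator and rule counts over the named components."""
    operators = sum(components[n][0] for n in names)
    rules = sum(components[n][1] for n in names)
    return operators, rules


combined = tally(["basic agent", "debriefing"])
with_language = tally(list(components))
```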

Beyond  the  agent  itself,  progress  has 
also  been  made  on  building  an  infrastructure 
to  support  research  and  development  on 
intelligent  automated  agents  for  tactical  air, 
and  beyond. 

Plans  for  the  coming  year  include 
completing  2v2  BVR  tactical  air,  and 
transitioning  TAS  from  tactical  air  to  close 
air  support  (a  form  of  air-to-ground 
engagement).  We  also  expect  to  have 
planning,  learning,  and  plan  recognition 
working  routinely  in  TAS,  and  to  have 
limited  amounts  of  natural  language  also  in 
routine  use.  Meanwhile,  incremental 
improvements  are  expected  to  continue  on 
the  infrastructure  for  research  and 
development. 
Abstract

Computer  generated  battlefield  agents  need  to  be 
able to explain the rationales for their actions.
Such  explanations  make  it  easier  to  validate  agent 
behavior,  and  can  enhance  the  effectiveness  of  the 
agents  as  training  devices.  This  paper  describes 
an  explanation  capability  called  Debrief  that  en¬ 
ables  agents  implemented  in  Soar  to  describe  and 
justify  their  decisions.  Debrief  determines  the  mo¬ 
tivation  for  decisions  by  recalling  the  context  in 
which  decisions  were  made,  and  determining  what 
factors  were  critical  to  those  decisions.  In  the  pro¬ 
cess  Debrief  learns  to  recognize  similar  situations 
where  the  same  decision  would  be  made  for  the 
same reasons. Debrief is currently being used by the
TacAir-Soar  tactical  air  agent  to  explain  its  ac¬ 
tions,  and  is  being  evaluated  for  incorporation  into 
other  reactive  planning  agents. 

1  Introduction 

The  Soar-IFOR  project  [15]  is  developing  intelli¬ 
gent  agents  that  can  control  tactical  aircraft  in  dis¬ 
tributed  battlefield  simulations.  A  key  objective 
of  the  project  is  to  endow  such  simulated  agents 
with  human-like  behavior.  Human  players  in  the 
simulation  should  not  be  able  to  tell  which  units 
are  controlled  by  humans  and  which  are  controlled 
by  computer,  lest  they  come  to  rely  on  this  knowl¬ 
edge.  Simulations  can  serve  as  an  effective  test  bed 
for  development  and  evaluation  of  tactics  only  if 
the  agents  realistically  employ  those  tactics. 

Yet  it  is  difficult  to  validate  through  observation 
that  agent  behavior  really  is  human-like.  Behav¬ 
ior  depends  upon  the  agent’s  goals  and  situation 
assessments,  and  these  can  change  from  moment 
to  moment.  A  given  action  might  be  appropriate 
in  one  situation,  and  altogether  inappropriate  in  a 
slightly  different  situation.  Therefore  the  fact  that 


an  agent  happens  to  behave  realistically  in  one  sce¬ 
nario  is  no  guarantee  that  the  agent  will  perform 
properly  in  other  scenarios.  In  order  to  gain  con¬ 
fidence  in  the  accuracy  of  the  agent’s  behavior  it 
is  helpful  to  examine  its  reasoning  processes,  and 
compare  them  against  human  reasoning. 

In  order  to  produce  human-like  behavior  we  have 
focussed  on  modeling  human  thought  processes 
and  reasoning,  using  the  Soar  cognitive  architec¬ 
ture  [14].  These  thought  processes  are  made  visi¬ 
ble  using  an  explanation  capability,  called  Debrief, 
that  describes  and  answers  questions  about  the 
agent’s  actions  and  decisions.  Debrief  can  also 
point  out  alternative  actions  that  might  have  been 
taken,  but  were  rejected.  This  helps  to  ensure  that 
the actions were performed for the right reasons,
and  were  not  chance  occurrences. 

Debrief  was  inspired  by  post-flight  debriefings  in 
tactical training. Debriefings are conducted after
training exercises so that instructors and trainees
can  understand  what  went  wrong  and  why,  and 
draw lessons that can be applied to future engagements.
Similar capabilities in Debrief make
it  easier  for  people  to  understand  and  improve  the 
performance  of  simulated  agents. 

2  Objectives  and  Approach 

The  following  are  the  design  objectives  for  Debrief 
in  TacAir-Soar. 

1.  It  should  describe  an  engagement  from  the 
agent’s  perspective,  explaining  what  the 
agent’s objectives were, what actions it took
to  meet  those  objectives,  and  its  assessment 
of  the  unfolding  situation. 

2.  It  should  accept  follow-on  questions  about 
those  actions,  objectives,  and  assessments, 
justifying  them  as  appropriate. 

3. In explaining actions it should use a combination
of media familiar to potential users,
including natural language and diagrams.

4.  These  capabilities  should  be  provided  without 
unnecessary  impact  on  the  design  and  imple¬ 
mentation  of  other  agent  capabilities. 

5.  The  explanation  capability  should  not  be  spe¬ 
cific  to  tactical  air-to-air  combat,  but  should 
be  applicable  to  a  variety  of  domains. 

The  most  obvious  way  to  provide  such  explana¬ 
tions  would  be  to  generate  English  paraphrases  of 
the  rules  and  rule  firing  traces  used  by  TacAir- 
Soar.  Yet  such  techniques  have  proved  to  be  in¬ 
effective  for  explaining  expert  system  reasoning 
[4,  16,  3],  and  are  likely  to  be  inappropriate  in 
computer  generated  forces  as  well.  They  contain 
too  many  implementation  details,  and  are  too  de¬ 
pendent  upon  the  particulars  of  how  knowledge  is 
encoded  in  the  system.  It  is  necessary  to  abstract 
away  from  these  details,  and  focus  on  the  essential 
knowledge  underlying  the  agent’s  decisions. 

Another  approach  might  be  to  encode  the  under¬ 
lying  knowledge  explicitly,  either  in  a  declarative 
form or as abstract meta-rules [2]. Such an approach
has a number of potential problems. First
of  all,  computer  generated  forces  require  a  great 
variety  of  reasoning  capabilities,  including  plan¬ 
ning,  plan  recognition,  learning,  and  geometric 
reasoning. The problem solving strategies and domain
knowledge representations required for many
of these capabilities are matters of current research.
If such knowledge were encoded declaratively,
but the agent did not make direct use of it,
and instead employed procedural rules or code, the
declarative descriptions would quickly become out of
date as the agent was extended and modified. This is
especially true for experimental intelligent systems
like TacAir-Soar. Yet if it did reason directly from
declarative representations its performance would
suffer, and the design of the agent would be greatly
affected, in contradiction to objective 4 listed above.
Automated  compilation  of  declarative  knowledge 
[13]  can  help  eliminate  the  performance  problems, 
but compilation techniques have not yet been employed
in systems that have as great a variety
of  reasoning  capabilities  as  TacAir-Soar  does. 

Debrief  takes  a  fundamentally  different  approach 
to explanation. In order to determine the rationales
for an agent’s decision, it recalls the situation
in which the decision was taken, and then replays
the agent’s tactical reasoning processes. By


monitoring  the  reasoning,  and  experimenting  with 
changing  the  situation  in  various  ways,  it  discov¬ 
ers  the  critical  factors  in  the  situation  that  led  to 
the  decision.  These  results  are  learned  so  that 
they  can  be  applied  to  other  decisions  in  simi¬ 
lar  situations.  In  effect  the  agent  constructs  a 
declarative model of its own reasoning through reflection.
This is analogous to human explanation
generation: experts often can perform tasks without
being conscious of the rationales underlying
their performance, and must reflect afterwards to
determine  what  those  rationales  might  have  been. 

Although  Debrief  was  originally  intended  for  ex¬ 
planation  in  the  tactical  air-to-air  domain,  the 
implementation  is  not  specific  to  that  domain. 
TacAir-Soar is in the process of being adapted
to handle air-to-ground operations; it is expected
that these changes will have little or no impact on Debrief.
Plans are underway to apply Debrief to an
entirely  different  domain,  namely  automated  con¬ 
trol  of  radar  tracking  stations  in  the  NASA  Deep 
Space  Network  [8,  7]. 


3  An  Example 

The  following  example  scenario  illustrates  how  De¬ 
brief  is  employed.  Suppose  that  the  TacAir-Soar 
agent  is  assigned  a  Barrier  Combat  Air  Patrol 
(BARCAP) mission, i.e., to search the skies for enemy
aircraft and intercept them so that they cannot
threaten a high value unit such as an aircraft
carrier.  During  the  course  of  the  mission  the  agent 
detects  a  hostile  aircraft.  The  agent  intercepts  the 
aircraft,  fires  a  missile  at  it  which  destroys  it,  and 
then  resumes  its  patrol. 

After  the  engagement  Debrief  can  be  used  to  ask 
the  agent  questions.  Dialog  is  conducted  via  a 
menu-oriented  interface.  First,  the  user  requests 
that  Debrief  describe  what  took  place  during  the 
engagement;  it  then  gives  a  step-by-step  descrip¬ 
tion  of  the  mission.  If  there  is  any  statement  in  the 
description  that  the  user  has  a  question  about,  he 
or she can click on it with the mouse and request
an  elaboration  or  explanation. 

Figure  1  shows  part  of  the  display  during  the 
course  of  the  interaction.  The  user  has  selected 
one  of  the  statements  in  the  description,  “I  started 
using my weapons,” and has requested a justification
for that decision. The explanation for this
step is displayed in the window shown; the original
description of the mission has almost entirely
scrolled off the top of the window. It lists several
reasons why the agent elected to use weapons
against the bogey: the agent had radar contact
with the bogey, it was known to be hostile, the
rules of engagement (ROE) were satisfied, and the
agent had already planned an intercept trajectory
for closing in on the bogey. It also lists alternative
actions that it might have taken but did not; for
example, if ROE had not been satisfied, the agent
would have closed in on the bogey, but would have
refrained from firing weapons at it.

Figure 1: The Debrief interaction window

It  is  also  possible  to  investigate  why  the  agent 
reached  particular  conclusions  during  the  course  of 
the  engagement.  For  example,  since  the  conclusion 
that  ROE  was  achieved  was  crucial  to  the  agent’s 
decision to employ weapons, it would be useful to
find out why the agent reached this conclusion.
This can be accomplished by selecting a conclusion
and asking a follow-up question about it. In
the window in the figure the conclusion “ROE was
achieved”  has  been  selected  with  the  mouse,  and 
an  explanation  for  why  this  conclusion  was  made 
appears  at  the  bottom  of  the  figure. 

4  Architecture  of  Debrief 

Figure  2  shows  the  overall  architecture  of  De¬ 
brief,  and  how  Debrief  fits  into  the  architecture 


of TacAir-Soar. Like all Soar systems, TacAir-Soar
is divided into problem spaces, each of which
is responsible for a particular subtask. These
problem spaces are organized hierarchically, where
lower level problem spaces accomplish goals that
are posed in the higher level spaces. The top
level space performs the switch between principal
modes  of  operation,  namely  accepting  mission  or¬ 
ders,  flying  the  mission,  and  debriefing  the  mis¬ 
sion. 

In  order  to  support  debriefing,  the  mission  prob¬ 
lem  spaces  are  augmented  with  an  event  mem¬ 
ory,  which  is  a  record  of  the  events  that  occurred 
during  the  mission.  A  set  of  operators  and  pro¬ 
ductions  monitor  TacAir-Soar’s  problem  solving 
state  during  the  execution  of  the  mission  in  or¬ 
der  to  construct  this  memory.  A  separate  work¬ 
ing  memory  specification  indicates  which  objects 
in  Soar’s  internal  working  memory  should  be  mon¬ 
itored  for  state  changes.  After  the  engagement  De¬ 
brief  retrieves  information  about  the  engagement 
from  the  event  memory  and  uses  it  to  describe 
and  explain  the  mission.  The  working  memory 
specification also helps to determine how to recall
episodes from the event memory and analyze
them, and what information about those episodes
is presented to the user.

The  debriefing  itself  is  performed  within  the  De¬ 
brief  problem  space,  which  alternates  between 
prompting  the  user  for  questions  via  the  Prompt- 
for-Question  problem  space,  and  answering  them 
via  the  Generate-Answer  problem  space.  Gener¬ 
ating  answers  involves  recalling  states  in  which 
events  occurred,  analyzing  the  rationales  for  the 
events  or  the  beliefs  that  the  agent  held  at  the 
time,  and  then  presenting  the  results  to  the  user. 
The  natural  language  generation  capability  within 
the  presentation  subsystem  is  also  used  in  the 
Prompt-for-Question  problem  space  to  construct 
menus  of  events  and  decisions  that  the  user  may 
select  from  when  forming  a  question. 

The  key  aspects  of  each  of  these  capabilities  within 
Debrief will be described in more detail below.


5  Inputting  Questions

Analysis  of  videotapes  of  mock  post-flight  debrief¬ 
ings  indicates  that  a  variety  of  types  of  questions 
can  arise  during  the  course  of  a  debriefing.  The 
question  input  capability  is  designed  to  enable 
users  to  pose  many  of  these  types  of  questions, 
without making users type their questions in English
and requiring Debrief to understand natural
language input.

Figure 2: The architecture of Debrief within TacAir-Soar

Questions  were  categorized  into  major  semantic 
types, following the methodology that is common
in question-answering systems [10]. Question
types currently supported include: Describe-Event
(describe an action or event and its circumstances);
Explain-Action (explain why the
agent performed a particular action); Explain-Conclusion
(explain why the agent drew a particular
conclusion); and Explain-Belief (explain why
the agent believed that a particular fact was true).
Instead of inferring the question type from the
user’s input, Debrief requires that the user explicitly
select a question type from a menu. This
avoids the problems of interpreting poorly articulated
questions, but does require that the user understand
the meaning of the question types, and
the distinction being made between
actions and conclusions.

Each  question  applies  to  a  specific  event,  decision, 
or  belief.  These  can  be  selected  via  the  inter¬ 
face.  Initially  when  the  user  selects  a  question 
type  Debrief  lists  the  events  in  its  event  memory 
that  are  of  the  appropriate  type  for  the  question. 
The  user  may  then  select  an  element  from  this  list. 


Subsequent  questions  can  be  formed  in  the  same 
manner,  or  by  selecting  fragments  of  text  with  the 
mouse,  using  a  technique  similar  to  that  employed 
by  Moore  and  Swartout  [12].  In  the  latter  case 
the  question  is  taken  to  refer  to  the  event  or  belief 
described  by  that  fragment  of  text. 

6  Memory  and  Recall 

Event  memory  in  Debrief  takes  advantage  of  the 
fact  that  the  major  problem  solving  steps  in 
Soar  systems,  namely  problem  spaces  and  opera¬ 
tors,  are  represented  explicitly  in  working  memory. 
Events  and  decisions  are  recorded  by  productions 
that check for particular operators being applied.
All  major  decisions  within  TacAir-Soar  are  imple¬ 
mented  as  operators,  as  are  situation  assessments. 
For example, the decision to use weapons is performed
by an operator called Employ-Weapons, so
a  production  was  added  to  TacAir-Soar  that  fires 
whenever  that  operator  is  applied.  An  event  token 
is  then  added  to  the  event  memory  indicating  the 
operator  that  was  applied  and  the  problem  space 
in  which  it  was  applied. 
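
The recording scheme just described can be illustrated outside Soar with a minimal Python sketch. It is not Debrief's implementation, which consists of Soar productions; the class and method names are hypothetical, and `Adjust-Heading` is an invented operator used only to show that unmonitored operators leave no trace.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EventToken:
    """One entry in the event memory: which operator applied, and where."""
    operator: str
    problem_space: str


@dataclass
class EventMemory:
    """Hypothetical stand-in for the event memory and its monitoring productions."""
    monitored_operators: set
    events: list = field(default_factory=list)

    def on_operator_applied(self, operator, problem_space):
        # Plays the role of a production that fires when a monitored
        # operator is applied and adds an event token to the memory.
        if operator in self.monitored_operators:
            self.events.append(EventToken(operator, problem_space))


memory = EventMemory(monitored_operators={"Employ-Weapons"})
memory.on_operator_applied("Employ-Weapons", "Intercept")
memory.on_operator_applied("Adjust-Heading", "Intercept")  # invented, unmonitored
```

Only the monitored operator produces a token, which mirrors the point that events are recorded selectively rather than exhaustively.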

Whenever  an  event  is  added  to  the  event  memory, 
TacAir-Soar’s working memory is checked to see if
there have been any changes since the last event.
Soar’s  memory  is  organized  as  a  collection  of  ob¬ 
jects  with  attributes  and  values.  Debrief  monitors 
a  subset  of  these  attributes,  and  adds  a  record  to 
the  event  memory  whenever  a  change  in  values  oc¬ 
curs. 

The decision of what state information to record depends
upon what information is required during
debriefing to explain decisions. Recording all state
information would be costly in a complex system
such as TacAir-Soar, and proves to be unnecessary.
The working memory specification determines
what information should be monitored and
recorded.  For  each  attribute  of  interest  the  specifi¬ 
cation  indicates  what  types  of  values  the  attribute 
can  take,  whether  or  not  an  attribute  can  as¬ 
sume  multiple  values  at  once,  and  how  those  values 
change  over  time.  If  the  values  of  the  attributes 
are  themselves  complex  objects  with  attributes, 
they  are  specified  in  the  same  way. 
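
As a rough illustration of how such a specification can drive selective recording, the following Python sketch declares two monitored attributes and reports only their changes between two snapshots of working memory. The attribute names, the `AttributeSpec` fields, and the dictionary representation of working memory are assumptions made for the example; the actual specification format is not given here, and the sketch omits the description of how values change over time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeSpec:
    """Hypothetical specification entry for one monitored attribute."""
    name: str
    value_type: type
    multi_valued: bool = False


def record_changes(spec, previous, current):
    """Return (attribute, old, new) records for monitored attributes that changed."""
    changes = []
    for attr in spec:
        old, new = previous.get(attr.name), current.get(attr.name)
        if old != new:
            changes.append((attr.name, old, new))
    return changes


# Invented attribute names; "fuel" is deliberately left out of the spec.
SPEC = [AttributeSpec("radar-contact", bool), AttributeSpec("roe-satisfied", bool)]
before = {"radar-contact": True, "roe-satisfied": False, "fuel": 5200}
after = {"radar-contact": True, "roe-satisfied": True, "fuel": 5100}
changes = record_changes(SPEC, before, after)
```

The change in `fuel` is ignored because it is not listed in the specification, so only the change in `roe-satisfied` is recorded.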

It  was  argued  in  Section  2  that  duplicate  represen¬ 
tations  of  knowledge  in  declarative  and  procedural 
form  can  lead  to  maintenance  problems.  However, 
this  is  not  a  serious  problem  in  the  case  of  the 
working  memory  specification  because  the  spec¬ 
ification  only  describes  the  structure  of  working 
memory,  not  how  that  working  memory  is  con¬ 
structed  and  used.  It  is  therefore  relatively  im¬ 
mune  to  modifications  to  the  TacAir-Soar  agent’s 
rule  base.  Additionally,  Debrief  can  optionally  be 
made  to  check  whenever  the  working  memory  state 
disagrees  with  the  specification,  and  warn  the  de¬ 
velopers  that  this  is  the  case.  The  advantage  is 
that  Debrief  can  be  incorporated  into  a  new  Soar 
system  simply  by  identifying  the  operators  in  that 
system  that  are  to  be  explained  and  constructing 
a  specification  for  the  system’s  working  memory. 

It  is  important  that  the  event  and  state  recording 
processes  not  add  significantly  to  working  memory 
size, since this could degrade the run-time performance
of TacAir-Soar. Therefore Debrief makes
use  of  Soar’s  learning  mechanism,  called  chunk¬ 
ing,  in  order  to  reduce  working  memory  load. 
Chunking  creates  new  production  rules  that  Soar 
can  then  use  in  subsequent  problem  solving.  De¬ 
brief  builds  so-called  recognition  chunks  during 
the  course  of  the  engagement  in  order  to  facili¬ 
tate  the  recall  of  state  information.  A  recogni¬ 
tion  chunk  will  fire  whenever  a  situation  that  Soar 
has  encountered  before  arises  again,  enabling  Soar 
to  recognize  that  the  situation  has  been  encoun¬ 
tered  before.  Debrief  continually  builds  recog¬ 
nition  chunks  during  the  engagement,  associat¬ 


ing  state  changes  with  events.  During  debrief¬ 
ing  the  Recall-State  problem  space  makes  use  of 
these  chunks  to  reconstruct  the  state  in  which  a 
given  event  occurs.  It  proposes  a  range  of  possi¬ 
ble  state  changes,  and  if  recognition  chunks  fire 
Debrief  then  knows  that  the  state  change  was  as¬ 
sociated  with  the  event.  Once  Recall-State  has 
reconstructed  the  working  memory  state  associ¬ 
ated  with  an  event  a  new  chunk  is  built  associ¬ 
ating  the  event  with  the  complete  state  descrip¬ 
tion.  Then  if  subsequent  questions  refer  back  to 
the same event the chunk leads to the immediate
recall  of  the  working  memory  state. 
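
The generate-and-recognize style of recall can be sketched in Python as follows. This is only an analogy: a set of (event, change) pairs stands in for recognition chunks, and a dictionary stands in for the chunk that later associates an event with its complete state. All names are hypothetical, and nothing here reflects how Soar's chunking is actually implemented.

```python
class RecognitionMemory:
    """Hypothetical analogue of recognition chunks plus recall chunks."""

    def __init__(self):
        self._recognized = set()   # stands in for recognition chunks
        self._recalled = {}        # stands in for event -> full-state chunks

    def learn(self, event, change):
        # During the engagement: associate a state change with an event.
        self._recognized.add((event, change))

    def recognizes(self, event, change):
        return (event, change) in self._recognized

    def recall_state(self, event, candidates):
        # During debriefing: propose candidate changes and keep those that
        # are recognized; cache the result for later questions.
        if event not in self._recalled:
            self._recalled[event] = frozenset(
                c for c in candidates if self.recognizes(event, c)
            )
        return self._recalled[event]


rm = RecognitionMemory()
rm.learn("event-7", ("roe-satisfied", True))
rm.learn("event-7", ("radar-contact", True))
candidates = [
    ("roe-satisfied", True), ("roe-satisfied", False),
    ("radar-contact", True), ("radar-contact", False),
]
state = rm.recall_state("event-7", candidates)
again = rm.recall_state("event-7", [])  # answered from the cached association
```

The second call does no candidate testing at all, which corresponds to the immediate recall available once the event has been associated with its complete state description.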

During  the  engagement  TacAir-Soar’s  working 
memory  size  stays  relatively  constant,  but  its  pro¬ 
duction  memory  constantly  grows  as  new  chunks 
are  built.  One  might  therefore  be  concerned 
that  the  additional  productions  would  lead  to  de¬ 
creased  performance.  Fortunately,  studies  have 
shown  that  Soar  systems  can  be  run  with  as  many 
as a million chunks in them without significant
slowdown [5]. This number of chunks is far greater
than  Debrief  has  yet  been  required  to  produce. 

7  Explaining  Decisions  and  Beliefs 

Once the circumstances surrounding a decision have
been  recalled,  it  is  then  possible  for  Debrief  to  de¬ 
termine  what  aspects  of  those  circumstances  led  to 
the  decision.  Figure  3  shows  the  problem  spaces 
that  are  involved  in  this  process.  The  first  step  is 
to  replay  the  original  decision,  to  verify  that  cir¬ 
cumstances  leading  to  the  decision  have  been  cor¬ 
rectly  recalled.  In  essence  Debrief  is  performing 
a kind of “what-if” simulation internal to Soar,
checking  to  see  what  TacAir-Soar  would  do  if  it 
were  in  the  recalled  situation.  Interaction  between 
TacAir-Soar  and  ModSAF  is  disallowed,  so  that 
this  what-if  simulation  does  not  have  an  unin¬ 
tended  effect  on  the  ModSAF  vehicle  that  TacAir- 
Soar  is  controlling. 

For  example,  in  the  case  of  the  decision  to  use 
weapons described above, Debrief recalls the operator
that was applied (Employ-Weapons), the
problem space in which it was applied (Intercept),
and the state in which it was applied (reconstructed
by  Recall-State).  This  information  is  passed  to  the 
Evaluate-Decision  problem  space,  which  in  turn 
passes control to a problem space called Test-Operator-Applicability
that sets up the Intercept problem
space so that it can be reinvoked in the recalled
state. Then control passes to the Intercept
problem space. If Intercept again selects the
Employ-Weapons operator, Debrief knows that
the recalled state description contains the information
that motivated the original application of
Employ-Weapons. Test-Operator-Applicability
then immediately terminates the what-if simulation
and returns to Evaluate-Decision an indication
that the expected operator was in fact applied.

Figure 3: Problem spaces for evaluating decisions

Reconsidering  the  original  decision  made  during 
the engagement is necessary for two reasons. First, it
is possible that the recalled state does not correspond
exactly to the situation in which the decision
was originally made. This could happen
if  the  TacAir-Soar  operator  was  modifying  work¬ 
ing  memory  at  the  same  time  that  working  mem¬ 
ory  was  being  recorded.  To  deal  with  this  case, 
a  problem  space  called  Establish-Applicability  is 
employed  to  modify  the  recalled  state  by  compar¬ 
ing  it  against  the  state  associated  with  the  im¬ 
mediately  preceding  event  and  trying  to  construct 
a state intermediate between the two in which the
operator  can  fire.  But  even  if  the  right  state  was 
recalled  it  is  important  to  reconsider  the  deci¬ 
sion  because  it  causes  chunks  to  be  built  during 


the  process.  These  chunks  summarize  the  condi¬ 
tions  in  the  recalled  state  that  caused  the  Employ- 
Weapons  operator  to  be  selected.  Implementa¬ 
tion  details  internal  to  the  Intercept  problem  space 
that  caused  the  operator  to  be  selected  are  au¬ 
tomatically  filtered  out  by  the  chunking  process. 
The details of how this filtering occurs are beyond
the scope of this paper; see [9].

Once  it  is  determined  that  the  recalled  operator 
is  applicable,  the  next  step  is  to  determine  what 
would  happen  if  the  situation  were  slightly  differ¬ 
ent  from  what  was  recalled.  This  helps  to  iden¬ 
tify  what  the  critical  factors  are  in  the  state,  and 
why  they  are  critical.  This  analysis  is  performed 
in  the  Determine  Applicability  Criteria  problem 
space.  This  space  repeatedly  deletes  elements 
from  the  recalled  state  description  and  checks  to 
see  whether  the  originally  selected  operator  would 
be  applied.  If  the  state  change  does  not  affect 
the applicability of the operator, a chunk previously
built by Test-Operator-Applicability will
fire,  recognizing  that  the  operator  is  still  applica¬ 
ble.  If  the  state  change  is  significant,  the  chunk 
will  not  fire  and  what-if  simulation  will  again 
be  performed  in  the  Test-Operator-Applicability 
space.  If  it  is  found  that  a  different  operator  is 
selected,  the  name  of  the  operator  is  returned  to 
Determine-Applicability-Criteria,  which  then  will 
perform  further  what-if  analyses  to  determine  why 
that  operator  was  selected.  Finally,  the  results 
of  these  analyses  are  returned  as  sets  of  signif¬ 
icant  attribute  values  associated  with  each  se¬ 
lected  operator.  This  is  yet  another  point  where 
Soar’s  learning  mechanism  is  used  to  advantage. 
The next time TacAir-Soar applies the Employ-Weapons
operator in a similar situation and Debrief
is asked a question about it, it will immediately
be able to recognize the similarity of the situation
and produce an explanation, without having
to  perform  any  what-if  simulation  at  all. 
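
The deletion-based analysis can be sketched in Python under strong simplifying assumptions. Here the recalled state is a flat set of facts, and `select_operator` is a hypothetical stand-in for replaying the Intercept problem space; its rule, the fact names, and the operator name `Close-In` are invented for illustration. The sketch also omits the learned shortcut, in which a previously built chunk recognizes that a deletion is irrelevant without any replay.

```python
def select_operator(state):
    """Hypothetical stand-in for a what-if replay of the Intercept space."""
    if {"radar-contact", "hostile", "intercept-planned"} <= state:
        return "Employ-Weapons" if "roe-satisfied" in state else "Close-In"
    return None


def applicability_criteria(state, chosen, select):
    """Delete one element at a time and note which deletions change the choice."""
    critical, alternatives = set(), {}
    for element in sorted(state):
        outcome = select(state - {element})
        if outcome != chosen:
            critical.add(element)
            if outcome is not None:
                # A different operator would have been selected instead.
                alternatives[element] = outcome
    return critical, alternatives


recalled = frozenset({
    "radar-contact", "hostile", "roe-satisfied",
    "intercept-planned", "altitude-high",
})
critical, alternatives = applicability_criteria(
    recalled, "Employ-Weapons", select_operator
)
```

In this toy state, `altitude-high` is not reported as critical because removing it leaves the selection unchanged, while removing `roe-satisfied` yields the alternative of closing in without firing, matching the example explanation given earlier.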

Explaining  beliefs  involves  many  of  the  same 
mechanisms that are used to evaluate decisions.
Debrief  searches  backwards  through  the  event  his¬ 
tory  for  the  first  event  whose  associated  state  in¬ 
cludes  the  belief  in  question.  Then  Debrief  re¬ 
moves  the  belief  from  the  state  description,  and 
performs  a  what-if  simulation  of  the  operator  as¬ 
sociated  with  the  event.  If  during  simulation  the 
belief  is  added  back  to  the  state,  that  indicates 
that  the  operator  was  responsible  for  asserting  the 
belief  into  Soar’s  working  memory.  Determine- 
Applicability-Criteria  can  then  be  used  to  deter¬ 
mine  why  that  particular  operator  was  selected. 
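
A minimal Python sketch of this belief analysis follows. It assumes one reading of the backward search, namely that Debrief walks back from the most recent event to the earliest event of the latest period in which the belief held. The event history, the operator names other than Employ-Weapons, and the `EFFECTS` table that stands in for what-if simulation are all invented for the example.

```python
# Hypothetical operator effects, standing in for what-if simulation.
EFFECTS = {
    "Detect-Contact": {"radar-contact"},
    "Identify-Contact": {"hostile"},
    "Check-ROE": {"roe-satisfied"},
}


def apply_operator(operator, state):
    """Replay an operator on a state and return the resulting state."""
    return state | EFFECTS.get(operator, set())


def earliest_event_with_belief(history, belief):
    """Scan backwards; return the index where the belief first held, or None."""
    found = None
    for index in range(len(history) - 1, -1, -1):
        _, state = history[index]
        if belief in state:
            found = index
        elif found is not None:
            break
    return found


def explain_belief(history, belief, apply_op):
    """Return the operator that re-asserts the belief when it is removed."""
    index = earliest_event_with_belief(history, belief)
    if index is None:
        return None
    operator, state = history[index]
    replayed = apply_op(operator, state - {belief})
    return operator if belief in replayed else None


full = frozenset({"radar-contact", "hostile", "roe-satisfied"})
history = [
    ("Detect-Contact", frozenset({"radar-contact"})),
    ("Identify-Contact", frozenset({"radar-contact", "hostile"})),
    ("Check-ROE", full),
    ("Employ-Weapons", full),
]
source = explain_belief(history, "roe-satisfied", apply_operator)
```

Once the responsible operator is identified, the same deletion-based analysis used for decisions can be applied to find out why that operator was selected.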




8  Presenting  Explanations 

Once  the  necessary  analysis  is  performed  by 
Debrief,  information  is  presented  to  the  user. 
This  presentation  is  performed  via  a  hierarchi¬ 
cal  presentation  planning  process,  initiated  in  the 
Present  problem  space  shown  in  Figure  2.  The 
planning  process  is  similar  to  that  of  other  multi- 
media  generation  systems  [6, 1],  although  its  abil¬ 
ity  to  coordinate  text  and  graphics  is  somewhat 
limited. 

8.1  Selecting  information  to  present 

The  first  step  in  the  presentation  process  is  select¬ 
ing  what  information  should  be  presented.  If  the 
question  that  was  asked  involved  evaluating  a  de¬ 
cision  or  belief,  this  step  is  trivial:  every  factor 
that  was  found  to  lead  to  the  decision  or  belief 
in  question  is  presented.  If  the  user  requested  a 
summary  of  an  event,  however,  the  case  is  more 
complicated.  Debrief  has  a  wealth  of  information 
available  about  every  event,  in  the  form  of  the 
state  information  associated  with  the  event  and 
any  substeps  of  the  event.  It  must  therefore  de¬ 
termine  which  pieces  of  information  are  relevant. 

Relevance  is  determined  by  constructing  and 
maintaining  a  model  of  what  the  user  is  expected 
to  know  about  the  engagement.  This  is  deter¬ 
mined  initially  through  a  short  questionnaire  that 
the  user  fills  out  when  he  or  she  first  sits  down  with 
the system. The user indicates the level of familiarity
with the mission orders, and with what actually
transpired during the engagement. Depending
upon the answers to these questions Debrief will
copy more or less information from TacAir-Soar’s
working  memory  into  the  user  model.  Later  on 
when  Debrief  is  planning  a  summary  of  a  given 
event,  it  compares  the  recalled  state  associated 
with  the  event  against  the  user  model.  If  corre¬ 
sponding  information  is  already  in  the  user  model 
then  it  is  not  presented. 

If  a  piece  of  information  is  not  present  in  the  user 
model, Debrief next checks whether it is readily inferrable
from other information that has already
been selected for presentation. In particular, Debrief
has knowledge about what facts are readily
inferrable as consequences of particular events.
Any  such  facts  are  omitted  from  the  explanation. 

After  each  event  is  described  to  the  user,  the  user 
model  is  updated.  All  information  that  has  been 
presented,  or  which  is  known  to  be  inferrable  from 
what  was  presented,  is  added  to  the  model.  How¬ 


ever,  it  is  assigned  a  lower  degree  of  confidence — 
just  because  Debrief  tells  the  user  some  fact  does 
not  mean  that  the  user  then  knows  it.  If  the  user 
requests an elaboration about a particular event,
information  assigned  to  the  user  model  with  a  low 
degree  of  confidence  will  still  be  presented. 
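
The selection and update rules of this subsection can be summarized in a small Python sketch. The two-level confidence scale, the fact names, and the `inferable` table are assumptions made for the example; the text does not say how confidence is actually represented.

```python
LOW, HIGH = 0.5, 1.0  # hypothetical confidence levels


def select_for_summary(facts, user_model, inferable):
    """Skip facts the user model contains or that follow from selected facts."""
    selected = []
    for fact in facts:
        if fact in user_model:
            continue
        if any(fact in inferable.get(chosen, set()) for chosen in selected):
            continue
        selected.append(fact)
    return selected


def select_for_elaboration(facts, user_model):
    """For elaborations, low-confidence entries are still presented."""
    return [f for f in facts if user_model.get(f, 0.0) < HIGH]


def update_user_model(user_model, presented, inferable):
    """Add presented facts and their consequences with low confidence."""
    for fact in presented:
        user_model.setdefault(fact, LOW)
        for consequence in inferable.get(fact, set()):
            user_model.setdefault(consequence, LOW)


user_model = {"mission-is-BARCAP": HIGH}  # e.g. taken from the questionnaire
facts = ["mission-is-BARCAP", "missile-fired", "missile-in-flight", "bogey-destroyed"]
inferable = {"missile-fired": {"missile-in-flight"}}

summary = select_for_summary(facts, user_model, inferable)
update_user_model(user_model, summary, inferable)
second_summary = select_for_summary(facts, user_model, inferable)
elaboration = select_for_elaboration(facts, user_model)
```

After the first summary, a repeated summary request presents nothing new, whereas an elaboration request still presents everything held with low confidence.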

8.2  Assigning  information  to  media 

Once  information  has  been  selected  for  presenta¬ 
tion,  Debrief  must  then  determine  what  presenta¬ 
tion  media  to  use.  It  currently  has  knowledge  of 
two  presentation  media:  natural  language  and  a 
graphical display of aircraft positions in 3-space.
Each  presentation  medium  is  specified  in  terms 
of  the  types  of  information  it  is  able  to  present: 
the graphical display is limited in its expressibility,
whereas natural language is unlimited. Each
piece  of  information  is  then  allocated  to  the  avail¬ 
able  media  depending  upon  the  type  of  presenta¬ 
tion  being  given.  If  a  summary  description  is  be¬ 
ing  presented,  only  one  medium  will  be  selected, 
and  graphical  media  will  be  preferred  over  textual 
media.  Otherwise  all  available  media  will  be  used. 
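
The allocation rule can be sketched in Python as below. The sketch assumes that the choice of a single medium in a summary is made per piece of information, which the text leaves open, and it invents the information types `position` and `belief` to describe what each medium can express.

```python
# Hypothetical media table: None means the medium can express any type.
MEDIA = {"graphics": {"position"}, "text": None}
PREFERENCE = ("graphics", "text")  # graphics preferred over text


def can_present(medium, info_type):
    types = MEDIA[medium]
    return types is None or info_type in types


def assign_media(items, summary):
    """Map each (name, info_type) item to the media that will present it."""
    assignment = {}
    for name, info_type in items:
        capable = [m for m in PREFERENCE if can_present(m, info_type)]
        # Summaries use only the most preferred capable medium.
        assignment[name] = capable[:1] if summary else capable
    return assignment


items = [("bogey-position", "position"), ("roe-satisfied", "belief")]
summary_plan = assign_media(items, summary=True)
full_plan = assign_media(items, summary=False)
```

Because natural language is unrestricted, every item always has at least one capable medium, so the assignment never leaves information unpresented.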

8.2.1  Generating  the  presentations 

At  the  present  time  the  only  medium  Debrief  can 
employ  is  natural  language,  because  the  interface 
that  will  allow  Debrief  to  control  the  display  pro¬ 
grammatically  is  not  yet  complete.  As  soon  as  the 
interface  is  complete,  it  will  be  possible  for  De¬ 
brief  to  start  presenting  the  information  that  it 
is  already  able  to  assign  to  that  medium.  In  the 
mean  time,  the  presentations  are  in  natural  lan¬ 
guage  instead.  Natural  language  is  produced  us¬ 
ing  a  simple  sentence  generator  loosely  based  on 
Functional  Unification  Grammar  [11]. 

9  Status  and  Evaluation 

The  Debrief  system  as  it  currently  stands  com¬ 
prises  thirteen  problem  spaces,  implemented  us¬ 
ing  eighty  Soar  operators  and  1556  productions. 
It  currently  can  describe  and/or  explain  a  total  of 
70  types  of  events.  The  natural  language  gener¬ 
ation  component  has  a  vocabulary  of  240  words 
and  phrases.  It  has  been  used  to  describe  and  ex¬ 
plain  events  occurring  in  1  v  1  engagements,  1  v  2 
engagements,  and  2  v  1  engagements. 

Formative  evaluations  of  Debrief  explanations 
have  been  performed  with  Navy  Reserve  fighter 
pilots. These evaluations confirmed that explanations
are extremely helpful for validating the
agent’s  performance,  and  building  confidence  in 
it.  They  also  underscored  the  importance  of  hav¬ 
ing  the  agent  justify  its  beliefs — the  evaluators  fre¬ 
quently  wanted  to  ask  questions  about  assertions 
made by Debrief during the course of the explanation.
This experience motivated the work on incorporating
the Explain-Belief question type into
Debrief.  Further  evaluations  and  demonstrations 
are  planned  for  later  this  year. 

Abstract

Intelligent artificial agents need to be able to
explain and justify their actions. They must
therefore understand the rationales for their own
actions. This paper describes a technique for
acquiring this understanding, implemented in a
multimedia explanation system. The system determines
the motivation for a decision by recalling
the situation in which the decision was made,
and replaying the decision under variants of the
original situation. Through experimentation the
agent is able to discover what factors led to the
decisions, and what alternatives might have been
chosen had the situation been slightly different.
The agent learns to recognize similar situations
where the same decision would be made for the
same reasons. This approach is implemented in
an artificial fighter pilot that can explain the
motivations for its actions, situation assessments,
and beliefs.

Introduction 

Intelligent artificial agents need to be able to provide
explanations and justifications for the actions that they
take. This is especially true for computer-generated
forces, i.e., computer agents that operate within
battlefield simulations. Such simulations are expected to
have an increasingly important role in the evaluation
of missions, tactics, doctrines, and new weapons systems,
and in training (Jones 1993). Validation of such
forces is critical: they should behave as humans would
in similar circumstances. Yet it is difficult to validate
behavior through external observation; behavior
depends upon the agent’s assessment of the situation
and its changing goals from moment to moment.
Trainees can greatly benefit from automated forces
that can explain their actions, so that the trainees can
learn how experts behave in various situations.
Potential users of computer-generated forces therefore
attach great importance to explanation, just as potential
users of computer-based medical consultation systems
do (Teach & Shortliffe 1984).

Explanations based on traces of rule firings or paraphrases of rules
tend not to be successful (Davis 1976; Swartout & Moore 1993; Clancey
1983b). They contain too many implementation details, and lack
information about the domain and about rationales for the design of the
system. More advanced explanation techniques encode domain knowledge
and problem-solving strategies and employ them in problem solving
either as metarules (Clancey 1983a) or in compiled form (Neches,
Swartout, & Moore 1985). In the computer-generated forces domain,
however, problem-solving strategies and domain knowledge
representations are matters of current research. An intelligent agent
in such a domain must integrate capabilities of perception, reactive
problem solving, planning, plan recognition, learning, geometric
reasoning and visualization, among others, all under severe real-time
constraints. It is difficult to apply meta-level or compilation
approaches in such a way that all of these requirements can be met at
once.

This paper describes a system called Debrief that takes a different
approach to explanation. Explanations are constructed after the fact by
recalling the situation in which a decision was made, reconsidering the
decision, and through experimentation determining what factors were
critical for the decision. These factors are critical in the sense that
if they were not present, the outcome of the decision process would
have been different. Details of the agent’s implementation, such as
which individual rules applied in making the decision, are
automatically filtered out. It is not necessary to maintain a complete
trace of rule firings in order to produce explanations. The
relationships between situational factors and decisions are learned so
that they can be applied to similar decisions.

This approach of basing explanations on abstract associations between
decisions and situational factors has similarities to the REX system
(Wick & Thompson 1989). But while REX requires one to create a separate
knowledge base to support explanation, Debrief automatically learns
much of what it needs to know to generate explanations. The approach is
related to techniques for acquiring domain models through
experimentation (Gil 1993), except that the agent learns to model not
the external world, but itself.

Figure 1: Part of an event summary


Debrief is implemented as part of the TacAir-Soar fighter pilot
simulation (Jones et al. 1993). Debrief can describe and justify
decisions using a combination of natural language and diagrams. It is
written in a domain-independent fashion so that it can be readily
incorporated into other intelligent systems. Current plans call for
incorporating it into the REACT system, an intelligent assistant for
operators of NASA Deep Space Network ground tracking stations (Hill &
Johnson 1994).


An  Example 

Consider the following scenario. A fighter is assigned a Combat Air
Patrol (CAP) mission, i.e., it should fly a loop pattern, scanning for
enemy aircraft. During the mission a bogey (an unknown aircraft) is
spotted on the radar. The E2C, an aircraft whose purpose is to scan the
airspace and provide information to the fighters, confirms that the
bogey is hostile. The fighter closes in on the bogey, fires a missile
which destroys the bogey, and then resumes its patrol.

After each mission it is customary to debrief the pilot. The pilot is
asked to describe the engagement from his perspective, and explain key
decisions along the way. The pilot must justify his assessments of the
situation, e.g., why the bogey was considered a threat.

TacAir-Soar is able to simulate pilots executing missions such as this,
and Debrief is able to answer questions about them. TacAir-Soar
controls a simulation environment called ModSAF (Calder et al. 1993)
that simulates the behavior of military platforms. TacAir-Soar receives
information from ModSAF about aircraft status and radar information,
and issues commands to fly the simulated aircraft and employ weapons.
After an engagement users can interact with Debrief to ask questions
about the engagement.

The  following  is  a  typical  interaction  with  Debrief. 


Figure 2: Explanations of the agent’s decisions


Questions are entered through a window interface, by selecting a type
of question and pointing to the event or assertion that the question
refers to. The first question selected is of type Describe-Event, i.e.,
describe some event that took place during the engagement; the event
chosen is the entire mission. Debrief then generates a summary of what
took place during the mission. The user is free to select statements in
the summary and ask follow-on questions about them.

Figure 1 shows part of a typical mission summary. One of the statements
in the summary, “I started using my weapons,” has been selected by the
user, so that a follow-on question may be asked about it. Figure 2
shows the display at a later point in the dialog, after follow-on
questions have been asked. First, a question of type Explain-Action was
asked of the decision to employ weapons, i.e., explain why the agent
chose to perform this action. The explanation appears in the figure,
beginning with the sentence “I started using my weapons because the
intercept geometry was selected and...” Debrief also lists an action
that it did not take, but might have taken under slightly different
circumstances: flying toward the bogey to decrease distance.

One can see that the agent’s actions are motivated largely by previous
assessments and decisions. The bottom of Figure 2 shows the answer to a
follow-on question relating to one of those assessments, namely “ROE
was achieved.”* Debrief lists the following factors: the bogey was
known to be hostile (i.e., a “bandit”), the bogey was identified
through electronic means, and confirmation of the identification was
obtained from the E2C.

*ROE stands for Rules of Engagement, i.e., the conditions under which
the fighter is authorized to engage the enemy.

In order to answer such questions, Debrief does the following. First,
it recalls the events in question and the situations in which the
events took place. When summarizing events, it selects information
about the intermediate states and subevents that should be presented,
selects appropriate media for presentation of this information (the
graphical display and/or natural language), and then generates the
presentations. To determine what factors in the situation led to the
action or conclusion, Debrief invokes the TacAir-Soar problem solver in
the recalled situation, and observes what actions the problem solver
takes. The situation is then repeatedly and systematically modified,
and the effects on the problem solver’s decisions are observed. Beliefs
are explained by recalling the situation in which the beliefs arose,
determining what decisions caused the beliefs to be asserted, and
determining what factors were responsible for the decisions.

Implementation Concerns

Debrief is implemented in Soar, a problem-solving architecture that
implements a theory of human cognition (Newell 1990). Problems in Soar
are represented as goals, and are solved within problem spaces. Each
problem space consists of a state, represented as a set of
attribute-value pairs, and a set of operators. All processing in Soar,
including applying operators, proposing problem spaces, and
constructing states, is performed by productions. During problem
solving Soar repeatedly selects and applies operators to the state.
When Soar is unable to make progress, it creates a new subgoal and
problem space to determine how to proceed. Results from these subspaces
are saved by Soar’s chunking mechanism as new productions, which can be
applied to similar situations.

The explanation techniques employed in Debrief are not Soar-specific;
however, they do take advantage of certain features of Soar.

• The explicit problem space representation enables Debrief to monitor
problem solving when constructing explanations.

• Since Soar applications are implemented in production rules, it is
fairly straightforward to add new rules for explanation-related
processing.

• Learning enables Debrief to reuse the results of previous explanation
processing, and build up knowledge about the application domain.

The current implementation of Debrief consists of thirteen Soar problem
spaces. Two are responsible for inputting questions from the user,
three recall events and states from memory, four determine the
motivations for actions and beliefs, three generate presentations, and
one provides top-level control. The following sections describe the
system components involved in determining motivations for decisions and
beliefs; other parts of the system are described in (Johnson 1994).

Memory  and  Recall 

In order for Debrief to describe and explain decisions, it must be able
to recall the decisions and the situations in which they occurred. In
other words, the agent requires an episodic memory. Debrief includes
productions and operators that execute during the problem solving
process in order to record episodic information, and a problem space
called Recall-State that reconstructs states using this episodic
information.

The choice of what episodic information to record is determined by a
specification of the agent’s working memory state. This specification
identifies the state attributes that are relevant for explanation, and
identifies their properties, e.g., their cardinality and signature, and
how the attribute values may change during problem solving. In order to
apply Debrief to a new problem solver, it is necessary to supply such a
specification for the contents of the problem solver’s working memory,
and indicate which operators implement decisions that should be
explainable. However, it is not necessary to specify how the problem
solver uses its working memory in making decisions; that is determined
by Debrief automatically.

When the problem solver applies an operator that is marked as
explainable, Debrief records the operator application in a list of
events that took place during the problem solving. It also records all
attribute values that have changed since the last problem solving event
that was recorded.

Debrief then builds chunks that associate the state changes with the
problem solving event. Once these chunks are built, the state changes
can be deleted from working memory, because the chunks are sufficient
to enable Debrief to recall the working memory state. During
explanation, when Debrief needs to recall the state in which a problem
solving event occurred, it invokes the Recall-State problem space. This
space reconstructs the state by proposing possible attribute values;
the chunks built previously fire, selecting the value that was
associated with the event. Recall-State aggregates these values into a
copy of the state at the time of the original event, and returns it.
This result is chunked as well, enabling Debrief immediately to recall
the state associated with the event should it need to refer back to it
in the future. This process is an instance of data chunking, a common
mechanism for knowledge-level learning in Soar systems (Rosenbloom,
Laird, & Newell 1987).
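
The recording and recall scheme described above can be approximated
outside Soar. The following is a minimal Python sketch under stated
assumptions, not the Debrief implementation: each explainable event
stores only the attribute values that changed since the previous
recorded event, and a state is rebuilt by replaying those changes. The
names EpisodicMemory, record, and recall_state are hypothetical, and a
plain dictionary cache stands in for the chunk that makes a second
recall of the same state immediate.

```python
from typing import Dict, List, Tuple

State = Dict[str, object]
_ABSENT = object()  # marks an attribute that was removed from the state


class EpisodicMemory:
    """Toy episodic memory: per-event deltas plus a recall cache."""

    def __init__(self) -> None:
        self._events: List[Tuple[str, State]] = []  # (event name, changed attributes)
        self._last: State = {}                      # state at the last recorded event
        self._recalled: Dict[int, State] = {}       # stands in for state-recall chunks

    def record(self, event: str, state: State) -> int:
        """Record an explainable event and the attributes changed since the last one."""
        delta: State = {k: v for k, v in state.items()
                        if k not in self._last or self._last[k] != v}
        for k in self._last:
            if k not in state:
                delta[k] = _ABSENT
        self._events.append((event, delta))
        self._last = dict(state)
        return len(self._events) - 1

    def recall_state(self, index: int) -> State:
        """Rebuild the state that held when event `index` was recorded."""
        if index in self._recalled:
            return dict(self._recalled[index])
        state: State = {}
        for _, delta in self._events[: index + 1]:
            for k, v in delta.items():
                if v is _ABSENT:
                    state.pop(k, None)
                else:
                    state[k] = v
        self._recalled[index] = dict(state)
        return state


# Illustrative attribute names and values, invented for this example.
memory = EpisodicMemory()
first = memory.record("select-intercept-geometry", {"bogey": "hostile", "range": 40})
second = memory.record("employ-weapons",
                       {"bogey": "hostile", "range": 25, "roe": "achieved"})
earlier_state = memory.recall_state(first)
```

Storing deltas keeps the record small, at the cost of replaying earlier
events on the first recall; the cache removes that cost on later
recalls, which is the role the paper assigns to the state-recall chunk.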

Debrief thus makes extensive use of Soar’s long term memory, i.e.,
chunks, in constructing its episodic memory. In a typical TacAir-Soar
run several hundred such chunks are created. This is more economical
than simply recording a trace of production firings, since over 6000
productions fire in a typical TacAir-Soar run. Since Soar has been
shown to be able to handle memories containing hundreds of thousands of
chunks (Doorenbos 1993), there should be little difficulty in scaling
up to more complex problem solving applications.

Figure 3: The process of evaluating decisions

Explaining  Actions  and  Conclusions 

Suppose that the user requests the motivation for the action “I started
using my weapons.” Debrief recalls the type of event involved, the
operator that was applied, the problem space in which it was applied,
and the problem solving state. In this case the event type is
Start-Event, i.e., the beginning of an operator application, the
operator is named Employ-Weapons, and the problem space is named
Intercept. The situation was one where the agent had decided to
intercept the bogey, and had just decided what path to follow in
performing the intercept (called the intercept geometry).

Analysis of recalled events such as this proceeds as shown in Figure 3.
The first step, testing applicability, verifies that TacAir-Soar would
select an Employ-Weapons operator in the recalled state. An operator
called Test-Operator-Applicability performs this verification, by
setting up a “mental simulation” of the original decision, and
monitoring it to see what operators are selected.

This initial test of operator applicability is important for the
following reasons. State changes are not recorded in episodic memory
until the operator has already been selected. The operator might
therefore modify the state before Debrief has a chance to save it,
making the operator inapplicable. This is not a problem in the case of
Employ-Weapons, but if it were, Debrief would attempt to establish
applicability, which involves recalling the state immediately preceding
the state of the event, and trying to find an interpolation of the two
states in which the operator would be selected. But even when recalling
the precise problem solving state is not a problem, verifying
applicability is useful because it causes chunks to be built that
facilitate subsequent analysis.

After a state has been found in which the recalled operator is
applicable, the next step is to determine applicability criteria, i.e.,
identify what attributes of the state are responsible for the operator
being selected. This also involves applying the
Test-Operator-Applicability operator to construct mental simulations.

Mental  simulation 

Given the problem space Intercept, the recalled state, the operator
Employ-Weapons, and the decision Start-Event(Employ-Weapons),
Test-Operator-Applicability operates as follows. It creates an instance
of the Intercept problem space as a subspace, and assigns as its state
a copy of the recalled state. The working memory specification
described above is helpful here: it determines which attributes have to
be copied. This state is marked as a simulation state, which activates
a set of productions responsible for monitoring mental simulations.
Test-Operator-Applicability copies into the simulation state the event
and the category of decision being evaluated. There are three such
categories: perceptions, which recognize and register some external
stimulus, conclusions, which reason about the situation and draw
inferences from it, and actions, which are operations that have some
effect on the external world. Employ-Weapons is thus an action. The
Intercept problem space is disconnected from external sensors and
effectors (the ModSAF simulator), so that mental simulation can be
freely performed. Execution then begins in the problem space. The first
operator that is selected is Employ-Weapons. The monitoring productions
recognize this as the desired operator, return a flag to the parent
state indicating that the desired event was observed, and the mental
simulation is terminated. If a different operator or event had been
selected instead, it would be checked to see if it is of the same
category as the expected operator, i.e., another action. If not,
simulation is permitted to continue; otherwise simulation is terminated
and a description of the operator that applied instead is returned.
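
The monitoring logic of this step can be sketched in Python. This is an
illustration under assumptions, not the Soar productions themselves:
the problem solver is modeled as a generator that yields (operator,
category) pairs as it selects operators, and the names
check_operator_applicability and toy_intercept_solver, along with the
attribute names and the range threshold, are hypothetical.

```python
import copy
from typing import Callable, Dict, Iterator, Optional, Tuple

State = Dict[str, object]
Selection = Tuple[str, str]  # (operator name, category)
Solver = Callable[[State], Iterator[Selection]]


def check_operator_applicability(solver: Solver, recalled: State, expected: str,
                                 category: str, max_steps: int = 50
                                 ) -> Tuple[bool, Optional[str]]:
    """Replay a decision on a copy of the recalled state.

    Returns (True, None) if `expected` is selected, (False, other) if a
    different operator of the same category is selected first, and
    (False, None) if neither happens within `max_steps` selections.
    """
    sim_state = copy.deepcopy(recalled)  # the recalled state is never modified
    for step, (operator, op_category) in enumerate(solver(sim_state)):
        if step >= max_steps:
            break
        if operator == expected:
            return True, None
        if op_category == category:
            return False, operator  # a competing decision of the same kind
        # Different category (e.g. a conclusion before an action): keep going.
    return False, None


def toy_intercept_solver(state: State) -> Iterator[Selection]:
    """Hypothetical stand-in for the Intercept problem space."""
    if state.get("roe") != "achieved":
        yield ("assess-roe", "conclusion")
        state["roe"] = "achieved" if state.get("bogey") == "hostile" else "not-achieved"
    if state.get("roe") == "achieved" and state.get("range", 1e9) <= 30:
        yield ("employ-weapons", "action")
    else:
        yield ("close-distance", "action")


near = {"bogey": "hostile", "roe": "achieved", "range": 25}
far = {"bogey": "hostile", "range": 60}
near_result = check_operator_applicability(
    toy_intercept_solver, near, "employ-weapons", "action")
far_result = check_operator_applicability(
    toy_intercept_solver, far, "employ-weapons", "action")
```

In the second call the toy solver first draws a conclusion, which the
monitor lets pass because it is not an action, and then selects a
different action, which ends the simulation and is reported as the
alternative.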

Whenever a result is returned from mental simulation, a chunk is
created. Such chunks may then be applicable to other situations, making
further mental simulation unnecessary. Figure 4 shows the chunk that is
formed when Debrief simulates the selection of the Employ-Weapons
operator. The conditions of the chunk appear before the symbol --> and
actions follow. Variables are symbols surrounded by angle brackets, and
attributes are preceded by a caret (^). The conditions include the
expected operator, Employ-Weapons, the problem space, Intercept, and
properties of the state, all properties of the bogey. If the operator
is found to be inapplicable, a different chunk is produced, that
indicates which operator is selected instead of the expected one.

Figure 4: An example chunk

These chunks built during mental simulation have an important feature:
they omit the details of how the operator and problem space involved
are implemented. This is an inherent feature of the chunking process,
which traces the results of problem solving in a problem space back to
elements of the supergoal problem space state. In this case the state
recalled from episodic memory is part of the supergoal problem space
state, so elements of the recalled state go into the left hand side of
the chunk.

Determining the cause for decisions

At this point it would be useful to examine the chunks built during
mental simulation in order to proceed to generate the explanation.
Unfortunately, productions in a Soar system are not inspectable within
Soar. This limitation in the Soar architecture is deliberate,
reflecting the difficulty that humans have in introspecting on their
own memory processes. It does not pose a serious problem for Debrief,
because the chunks built during mental simulation can be used to
recognize which attributes of the state are significant.

The identification of significant attributes is performed in the
Determine-Applicability-Criteria problem space, which removes
attributes one by one and repeatedly applies
Test-Operator-Applicability. If a different operator is selected, then
the removed attribute must be significant. If the value of a
significant attribute is a complex object, then each attribute of that
object is analyzed in the same way; the same is true for any
significant values of those attributes. Meanwhile, if the variants
resulted in different operators being selected, the applicability
criteria for these operators are identified in the same manner. This
generate-and-test approach has been used in other Soar systems to
enlist recognition chunks in service of problem solving (Vera, Lewis, &
Lerch 1993), and is similar to Debrief’s mechanism for reconstructing
states from episodic memory. Since the state representations are
hierarchically organized, the significant attributes are found quickly.
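
The attribute-removal procedure can be illustrated with a short Python
sketch. It assumes states are nested dictionaries and that `select` is
any function returning the operator chosen in a given state; the
function name significant_attributes, the toy selector, and the
attribute names are hypothetical, and the sketch omits the analysis of
alternative operators.

```python
from typing import Callable, Dict, List

State = Dict[str, object]


def significant_attributes(select: Callable[[State], str], state: State,
                           expected: str) -> List[str]:
    """Return dotted paths of attributes whose removal changes the selected operator."""

    def without(path: List[str]) -> State:
        # Copy only along the path being modified, then drop the last key.
        root = dict(state)
        node = root
        for key in path[:-1]:
            node[key] = dict(node[key])
            node = node[key]
        del node[path[-1]]
        return root

    def probe(node: Dict[str, object], prefix: List[str]) -> List[str]:
        found: List[str] = []
        for key, value in node.items():
            path = prefix + [key]
            if select(without(path)) != expected:
                if isinstance(value, dict):
                    # Complex value: analyze its attributes in the same way.
                    inner = probe(value, path)
                    found.extend(inner if inner else [".".join(path)])
                else:
                    found.append(".".join(path))
        return found

    return probe(state, [])


def toy_select(state: State) -> str:
    """Hypothetical selector: fire only at a hostile bogey within range."""
    bogey = state.get("bogey", {})
    if bogey.get("hostile") and bogey.get("range", 1e9) <= 30:
        return "employ-weapons"
    return "close-distance"


state = {"bogey": {"hostile": True, "range": 25, "callsign": "B1"}, "fuel": 0.8}
criteria = significant_attributes(toy_select, state, "employ-weapons")
```

Because the top-level probe shows that `fuel` is irrelevant, none of
its internals would ever be examined; only the `bogey` object is opened
up, which is the benefit of a hierarchically organized state.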

If chunking were not taking place, Debrief would be performing a long
series of mental simulations, most of which would not yield much useful
information. But the chunks that are created help to ensure that
virtually every mental simulation uncovers a significant attribute, for
the following reason. Subgoals are created in Soar only when impasses
occur. Test-Operator-Applicability instantiates the mental simulation
problem space because it tries to determine whether the recalled
operator is applicable, is unable to do so, and reaches an impasse.
When chunks such as the one in Figure 4 fire, they assert that the
operator is applicable, so no impasse occurs. Mental simulation thus
occurs only in situations that fail to match the chunks that have been
built so far. In the case of the Employ-Weapons operator, a total of
seven mental simulations of variant states are required: two to
determine that the bogey is relevant, and five to identify the bogey’s
relevant attributes.

Furthermore, even these mental simulations become unnecessary as
Debrief gains experience explaining missions. Suppose that Debrief is
asked to explain a different Employ-Weapons event. Since most of the
significant features in the situation of this new event are likely to
be similar to the significant features of the previous situation, the
chunks built from the previous mental simulations will fire. Mental
simulation is required only for the situational features that are
different, or if the operator was selected for different reasons.
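
The way chunks suppress redundant simulations can be imitated with a
small cache. The Python sketch below is an analogy, not Soar chunking:
it assumes the solver reads the state only through a tracking view, and
it builds each cached rule from exactly the attributes the solver
examined, so that states differing only in unexamined attributes match
the rule without a new simulation. TrackingView, ChunkingSimulator, and
the toy solver are hypothetical names.

```python
from typing import Callable, Dict, List, Set, Tuple

State = Dict[str, object]
_MISSING = object()  # stands for "attribute not present in the state"


class TrackingView:
    """Read-only view of a state that records which attributes were examined."""

    def __init__(self, state: State) -> None:
        self._state = state
        self.touched: Set[str] = set()

    def get(self, attr: str, default: object = None) -> object:
        self.touched.add(attr)
        return self._state.get(attr, default)


class ChunkingSimulator:
    """Runs the solver only when no previously built rule matches the state."""

    def __init__(self, solver: Callable[[TrackingView], str]) -> None:
        self.solver = solver
        self.chunks: List[Tuple[State, str]] = []  # (conditions, selected operator)
        self.simulations = 0

    def select(self, state: State) -> str:
        for conditions, result in self.chunks:
            if all(state.get(a, _MISSING) == v for a, v in conditions.items()):
                return result  # a rule fires, so no simulation is needed
        view = TrackingView(state)
        self.simulations += 1
        result = self.solver(view)
        # Conditions mention only what the solver actually looked at.
        conditions = {a: state.get(a, _MISSING) for a in view.touched}
        self.chunks.append((conditions, result))
        return result


def toy_solver(view: TrackingView) -> str:
    """Hypothetical decision rule for the example."""
    if view.get("roe_achieved") and view.get("in_range"):
        return "employ-weapons"
    return "close-distance"


sim = ChunkingSimulator(toy_solver)
a = sim.select({"roe_achieved": True, "in_range": True, "altitude": 20000})
b = sim.select({"roe_achieved": True, "in_range": True, "altitude": 5000})
c = sim.select({"roe_achieved": False, "in_range": True})
```

The second call reuses the rule learned from the first because altitude
was never examined; the third differs in an examined attribute and
therefore triggers a new simulation.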

Two kinds of chunks are built when Determine-Applicability-Criteria
returns its results. One type identifies all of the significant
features in the situation in which the decision was made. The other
type identifies an operator that might have applied instead of the
expected operator, and the state in which the operator applies. These
chunks are created when mental simulation determines that an operator
other than the expected one is selected. Importantly, the chunks fire
whenever a similar decision is made in a similar situation. By
accumulating these chunks Debrief thus builds an abstract model of the
application domain, associating decisions with their rationales and
alternatives. The problem solver’s performance-oriented knowledge is
reorganized into a form suited to supporting explanation.

Performing mental simulation in modified states complicates mental
simulation in various respects. The result of deleting an attribute is
often the selection of an operator in mental simulation to reassert the
same attribute. Debrief must therefore monitor the simulation and
detect when deleted attributes are being reasserted. The modified state
may cause the problem solver to fail, resulting in an impasse. Mental
simulation must therefore distinguish impasses that are a normal result
of problem solving from impasses that suggest that the problem solver
is in an erroneous state.

There is one shortcoming of the analysis technique described here.
Chunking in Soar cannot always backtrace through negated conditions in
the left hand sides of productions. Therefore if the problem solver
opted for a decision because some condition was absent in the
situation, Debrief may not be able to detect it. Developers of Soar
systems get around this problem in chunking by using explicit values
such as *unknown* to indicate that information is absent. This same
technique enables Debrief to identify the factors involved.

Relationship  to  other  exploratory  learning 
approaches 

The closest correlate to Debrief’s decision evaluation capability is
Gil’s work on learning by experimentation (Gil 1993). Gil’s EXPO system
keeps track of operator applications, and the states in which those
operators were applied. If an operator is found to have different
effects in different situations, EXPO compares the states to determine
the differences. Another system by Scott and Markovich (Scott &
Markovich 1993) performs an operation on instances of a class of
objects, to determine whether it has different effects on different
members of the class. This enables it to discover discriminating
characteristics within the class.

Some exploratory learning systems, such as Rajamoney’s systems
(Rajamoney 1993), invest significant effort to design experiments that
provide the maximum amount of information. This is necessary because
experiments can be costly and can have persistent effects on the
environment. Debrief’s chunking-based technique filters out irrelevant
experiments automatically, without significant effort. Side effects on
the environment are not a concern during mental simulation.

Explaining  Beliefs 

Explaining beliefs, e.g., that ROE was achieved, involves many of the
same analysis steps used for explaining decisions. Debrief starts by
searching memory for the nearest preceding state in which the belief
came to be held. It determines what operator was being applied during
that state, and uses Establish-Applicability if necessary to make sure
that the operator applies in the recalled state. If the belief had to
be retracted in order to make Test-Operator-Applicability succeed, then
the operator was responsible for asserting the belief. Such is the case
for the belief that ROE is achieved, which is asserted by an operator
named ROE-Achieved. Otherwise, Debrief would remove the belief and
attempt mental simulation again; if the belief is asserted in the
course of applying the operator, the operator is probably responsible
for the belief.
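
A rough Python sketch of the two steps just described follows: locating
the event at which a belief first came to be held, and checking whether
re-applying that event’s operator with the belief removed brings the
belief back. It assumes a belief is represented simply as the presence
of an attribute in the recorded state; the function names, the toy
operator model, and the event log are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, object]
Event = Tuple[str, State]  # (operator applied, state recorded with the event)


def find_belief_origin(events: List[Event], query: int, belief: str) -> Optional[int]:
    """Index of the nearest event at or before `query` where `belief` came to be held."""
    if belief not in events[query][1]:
        return None
    index = query
    while index > 0 and belief in events[index - 1][1]:
        index -= 1
    return index


def operator_asserts(apply_op: Callable[[str, State], State], operator: str,
                     state: State, belief: str) -> bool:
    """Remove the belief, re-apply the operator, and see whether the belief returns."""
    trial = {k: v for k, v in state.items() if k != belief}
    return belief in apply_op(operator, trial)


def toy_apply(operator: str, state: State) -> State:
    """Hypothetical operator model: ROE is achieved once the bogey is hostile."""
    result = dict(state)
    if operator == "roe-achieved" and state.get("bogey") == "hostile":
        result["roe-achieved"] = True
    return result


log: List[Event] = [
    ("identify-bogey", {"bogey": "hostile"}),
    ("roe-achieved", {"bogey": "hostile", "roe-achieved": True}),
    ("select-intercept-geometry",
     {"bogey": "hostile", "roe-achieved": True, "geometry": "selected"}),
    ("employ-weapons",
     {"bogey": "hostile", "roe-achieved": True, "geometry": "selected"}),
]
origin = find_belief_origin(log, 3, "roe-achieved")
responsible = operator_asserts(toy_apply, log[origin][0], log[origin][1], "roe-achieved")
```

Here the search walks back from the Employ-Weapons event to the event
where the belief first appears, and the re-application test attributes
the belief to the operator recorded there.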

Summary  of  the  Effects  of  Learning 

Learning via chunking takes place throughout the Debrief system. The
following is a summary of the different types of chunks that are
produced:

• Episodic memory recognition chunks: event + attribute value -->
recognition;

• State recall chunks: event --> state;

• Mental simulation chunks: event + problem space + state -->
applicable or inapplicable + alternative operator;

• Applicability analysis chunks: event + problem space + state -->
significant state attribute; event + problem space + state -->
alternative operator + alternative state;

• Natural language generation chunks: case frame --> list of words;
content description --> list of utterances;

• Presentation chunks: content description + user model --> utterances
+ media control commands + user model updates.

The presentation mechanisms that yield the latter two types of chunks
are described in (Johnson 1994). Altogether, these chunks enable
Debrief to acquire significant facility in explaining problem solving
behavior. These chunks result in speedups during the course of
explaining a single mission. Future experiments will determine the
transfer effects between missions.

Evaluation  and  Status 

The  implementation  of  Debrief  comprises  over  1700 
productions;  in  a  typical  session  these  are  augmented 
by  between  500  and  1000  chunks.  Debrief  currently  can 
describe  and/or  explain  a  total  of  66  types  of  events  in 
the tactical air domain. Its natural language generation component has
a vocabulary of 259 words and phrases. Debrief can explain a range of
one-on-one and one-on-two air-to-air engagements.

Formative evaluations of Debrief explanations have been performed with
US Naval Reserve fighter pilots. These evaluations confirmed that
explanations are extremely helpful for validating the agent’s
performance, and building confidence in it. They also underscored the
importance of having the agent justify its beliefs: the evaluators
frequently wanted to ask questions about assertions made by Debrief
during the course of the explanation. This motivated the development of
support for the Explain-Belief question type. There was immediate
interest on the part of the subject matter experts in using Debrief to
understand and validate the behavior of TacAir-Soar agents.

The weakest point of the current system is its natural language
generation capability. However, this was found not to be a major
concern for the evaluators. Their primary interest was in understanding
the thinking processes of TacAir-Soar, and to the extent that Debrief
made that reasoning apparent it was considered effective.

Conclusion

This paper has described a domain-independent technique for analyzing
the reasoning processes of an intelligent agent in order to support
explanation. This technique reduces the need for extensive knowledge
acquisition and special architectures in support of explanation.
Instead, the agent can construct explanations on its own. Learning
plays a crucial role in this process. Next steps include extending the
range of questions that can be answered, improving the natural language
generation, and making greater use of multi-media presentations. There
is interest in using the mental simulation framework described here to
improve the agent’s problem solving performance, by discovering
alternative decision choices with improved outcomes.

Abstract

One  of  the  most  important  tasks  in  a 
tactical  engagement  is  to  maintain  aware¬ 
ness  of  the  current  situation.  This  is  as 
true  for  simulated  intelligent  agents  as  it 
is for humans in real engagements. We
have  identified  two  key  capabilities  that 
are  required  for  maintaining  situational 
awareness:  managing  and  synthesizing 
information  from  a  variety  of  informa¬ 
tion  sources,  and  correctly  identifying  and 
sorting  engagement  participants  into  an 
appropriate  mental  representation.  This 
paper  discusses  our  efforts  in  addressing 
these  capabilities  within  the  TacAir-Soar 
system. 

An  intelligent  simulated  agent  must  be 
able  to  observe  and  interpret  the  world 
it  is  operating  in.  This  includes  observ¬ 
ing  how  the  world  reacts  to  behaviors 
generated  by  the  agent,  as  well  as  behav¬ 
iors  generated  by  other  agents  within  the 
simulation.  As  the  world  changes,  the 
agent must continuously build and maintain
a “mental picture” of the world’s
current state (i.e., maintain situational
awareness). Otherwise there is no hope
of generating appropriate behaviors to
accomplish the agent’s goals. In order to
maintain  such  a  picture,  the  agent  should 
use  whatever  information  sources  it  has 
available.  In  general,  more  information 
sources  are  better,  but  having  multiple 
sources  demands  that  the  agent  be  able 
to synthesize the different types of information
in order to form a representation
of  the  world  that  is  as  complete  and  cor¬ 
rect  as  possible. 

In  earlier  work  (Jones,  Tambe,  Laird,  & 
Rosenbloom,  1993)  we  concentrated  on 
building  agents  that  generated  reason¬ 
able  behavior  given  rather  strong  assump¬ 
tions  about  world  information.  The  past 
TacAir-Soar  agent  assumed  that  there 
were  at  most  two  agents  operating  in  the 
simulated  tactical  air  environment:  the 
agent  itself  and  one  potential  enemy.  In 
addition,  the  agent  had  only  two  sources 
of  information:  cockpit  controls  reported 
information  about  the  agent’s  vehicle  and 
weapons,  and  a  radar  reported  informa¬ 
tion  about  the  other  participant  in  the 
environment. 

This  arrangement  allowed  the  system  to 
generate  some  tactical  behaviors,  but  it 
greatly  limited  the  types  of  situations  in 
which  TacAir-Soar  could  function.  More 
typically, a tactical air agent finds itself
in situations similar to that shown in Figure 1.
There can be a number of participants
in the engagement, and a number
of  ways  to  gather  information  about  these 
agents. 

Thus, in our current work, we have expanded
the abilities of the TacAir-Soar
agent to manage information. The current
agent is able to maintain mental
representations of any number of other
participants in the simulation.[1] In addition,
the agent now synthesizes information
from a number of different sources.
Each agent receives information visually,
from its radar, and via radio from
other participants in the engagement.
These increases in capabilities are necessary
in order for TacAir-Soar to function
reasonably in the complex domain
of tactical flight. However, they also introduce
a number of complexities to the
task of maintaining situational awareness,
or keeping a mental picture of what
is happening in the world. We feel that
maintaining situational awareness boils
down to two cognitive capabilities: managing
information from multiple sources
and managing information about multiple
participants in an engagement. This paper
discusses our approach to addressing
these two broad issues within the TacAir-Soar
system.

[1] In theory this number is unbounded, but
in practice agent performance can degrade
dramatically when it has too many other
agents to pay attention to.

Managing multiple information sources

Besides receiving information from its
own vehicle’s instruments and gauges, the
current version of TacAir-Soar receives
information  about  other  participants 
from  three  basic  sources.  The  agent  may 
receive  information  from  a  visual  con¬ 
tact  with  another  simulation  participant 
such  as  another  airplane  (this  informa¬ 
tion  comes  in  through  a  DIS  visual  object 
package).  The  agent  may  achieve  a  radar 
contact  with  a  participant.  Finally,  the 
agent  may  receive  communicated  infor¬ 
mation  about  another  participant  (this 
information  may  come  from  a  ground 
controller,  an  air  controller,  or  perhaps 
a  section  or  division  partner).  In  addi¬ 
tion,  TacAir-Soar  periodically  records 
position  information  for  current  contacts. 
Thus,  when  no  active  (visual,  radar,  or 
communication)  contact  information  is 
available,  the  agent’s  memory  becomes  a 
fourth  source  of  information. 

Figure 1. A typical engagement involving multiple participants and multiple
sources of information.

When there is only one active information
source, things are relatively simple.
The system simply uses the information
available to track the contact. This may
not always be the best or most up-to-date
information, but the system can only
make do with what it has. When there
are multiple active information sources
describing contact with a particular participant,
difficulties may arise. In this
case, there is a decision to be made about
where to look for the correct information.
Some types of information are only available
from particular types of sources, but
others are provided by all of the information
sources. For example, both radar
and visual information can provide the
relative position of another airplane, but
radar can provide a more accurate measurement
of the airplane’s altitude and
speed. In addition, information sources
have different update rates, so some may
contain “stale” information at certain
times. For example, visual information
has an almost instantaneous update rate,
radar information depends on the speed
of the actual radar sensors, and communicated
information is updated relatively
slowly. TacAir-Soar currently prioritizes
its information sources by assuming that
visual information is generally better
than radar, radar is generally better than
communication, and communication is
generally better than memorized information.
In addition, the system ranks
current information by remembering how
long it has been since the information
was last updated. When it wants to look
up information for a particular participant,
it uses information from the best
existing source.
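
The source prioritization described above can be illustrated with a short sketch. This is not the TacAir-Soar implementation (which is written within the Soar architecture); it is a minimal Python illustration under stated assumptions. The names `Report`, `SOURCE_RANK`, `best_report`, and the `max_age` cutoff are hypothetical, and the text does not say exactly how source type and staleness are combined, so the rule used here (prefer the better source type, break ties by recency, optionally discard reports older than a cutoff) is only one plausible reading.

```python
from dataclasses import dataclass

# Lower rank = preferred source, following the ordering stated in the text:
# visual > radar > communication > memory.
SOURCE_RANK = {"visual": 0, "radar": 1, "communication": 2, "memory": 3}


@dataclass
class Report:
    source: str        # one of the keys in SOURCE_RANK
    updated_at: float  # simulation time (seconds) of the last update
    data: dict         # whatever the source reported, e.g. {"altitude": 20000}


def best_report(reports, now, max_age=None):
    """Pick the report to trust for one participant.

    Reports older than max_age (an assumed, optional cutoff) are ignored.
    Among the rest, the preferred source type wins, and ties between
    reports of the same type go to the most recently updated one.
    """
    usable = [r for r in reports
              if max_age is None or now - r.updated_at <= max_age]
    if not usable:
        return None
    return min(usable,
               key=lambda r: (SOURCE_RANK[r.source], now - r.updated_at))
```

With a two-second-old radar report and a half-second-old communicated report, `best_report` prefers radar unless a `max_age` cutoff excludes the radar report as stale.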

Another  issue  involves  what  actions 
TacAir-Soar should take in order to gain
new  information  about  a  participant.  In 
the  older  version  of  the  system  this  was 
a  simple  matter  because  there  was  only 
a  radar  information  source.  If  the  sys¬ 
tem  did  not  have  radar  contact,  it  did 
what  it  could  to  achieve  a  radar  contact. 
Now,  however,  there  are  many  different 
information  sources  and  different  ways  to 
achieve  them.  In  addition,  some  sources 
are  better  than  others  in  different  sit¬ 
uations.  For  example,  radar  is  good  at 
accurately  tracking  altitudes  and  head¬ 
ings  at  a  distance,  but  rapid  visual  infor¬ 
mation  is  necessary  as  the  engagement 
progresses. 

For  example,  if  TacAir-Soar  only  has 
recorded  information  about  an  agent,  the 
system  might  request  some  communi¬ 
cated  information  in  order  to  achieve  a 
better  idea  of  what  the  agent  is  doing. 
Communicated information is useful (particularly
at long ranges), but it is somewhat
inaccurate and takes a while to report.
Thus, the system still does what
it  can  to  get  a  radar  or  visual  contact  in 
order  to  get  faster,  more  reliable  informa¬ 
tion.  The  point  here  is  that  TacAir-Soar 
not  only  requires  knowledge  for  managing 
information  from  multiple  sources,  but 
it  also  must  have  the  knowledge  to  seek 
out  different  types  of  information  contacts 
when  appropriate. 

Identifying and tracking multiple participants

In  the  tactical  air  domain,  there  are 
generally  a  number  of  participants  in 
each  engagement.  A  particular  simu¬ 
lated  agent  will  probably  have  a  section 
partner,  and  there  may  be  any  number 
of  other  friendly  and  hostile  participants 
that  the  agent  must  worry  about.  The 
major  difficulty  arises  in  creating  a  map¬ 
ping  between  the  participants  that  “are 
really  out  there”  and  the  participants 
that  the  agent  is  currently  receiving  in¬ 
formation  about  (from  at  least  one  of  the 
information  sources).  Thus,  much  of  our 
research  effort  has  been  on  finding  an  ef¬ 
ficient,  accurate,  and  realistic  method  to 
maintain  this  mapping. 

The  problem  can  be  summarized  as  fol¬ 
lows.  The  agent  may  have  a  mental  rep¬ 
resentation  of  a  number  of  other  partic¬ 
ipants  in  the  current  simulation  (we  will 
refer  to  these  as  mental  agents).  Now  the 
agent  receives  a  new  contact  (i.e.,  new 
visual, radar, or communicated information
becomes available). The agent must
now  decide  whether  this  new  contact  cor¬ 
responds  to  one  of  the  mental  agents,  or 
whether this is a new participant (requiring
the creation of a new mental agent).
If the contact corresponds to one of the
existing mental agents, TacAir-Soar must
decide which mental agent the information
pertains to. Only after this mapping
has been completed can the system
correctly  interpret  and  respond  to  the 
new  information.  This  should  be  done  as 
quickly  as  possible,  but  it  should  also  be 
done  with  the  same  intelligence  and  flex¬ 
ibility  that  human  pilots  have.  It  can  be 
disastrous,  for  example,  to  conclude  by 
mistake  that  a  hostile  participant  is  the 
agent’s  section  partner. 

A  similar  problem  arises  in  the  case 
where  two  agents  must  communicate  with 
each  other  about  other  participants  in 
the  engagement.  For  example,  the  lead 
agent  of  a  flight  section  may  need  to  tell 
its  partner  which  bogey  it  is  targeting. 
However,  the  two  agents  will  not  neces¬ 
sarily  have  the  same  mental  agents  repre¬ 
sented,  and  often  they  will  not  even  have 
the  same  information  coming  in  on  their 
sensors.  The  solution  for  this  is  for  the 
lead  to  describe  particular  characteris¬ 
tics  of  the  bogey,  so  the  partner  can  use 
this  information  to  determine  which  men¬ 
tal  agent  the  lead  is  talking  about.  In 
TacAir-Soar,  the  problem  of  communicat¬ 
ing  about  other  engagement  participants 
is  subsumed  by  the  general  problem  of 
identifying and sorting incoming information
(regardless of the particular information
source) to the appropriate mental
agent. 

TacAir-Soar  solves  this  problem  by  pass¬ 
ing  new  information  through  a  set  of  fil¬ 
ters.  The  first  filter  determines  whether 
the new information closely matches any
existing  contact  information  for  a  men¬ 
tal  agent  (e.g.,  the  system  might  achieve 
radar  contact  with  an  agent  for  which 
it had only previously received communicated
information). If this filter fails
to  identify  a  unique  mental  agent,  the 
next  filter  compares  any  new  position 
information  to  the  last  position  informa¬ 
tion  the  system  recorded  for  each  mental 
agent.  TacAir-Soar  uses  a  form  of  tem¬ 
poral  reasoning,  based  on  the  time  of  the 
last  recorded  position  for  each  mental 
agent,  together  with  the  contact’s  head¬ 
ing,  speed,  etc.,  to  determine  which  men¬ 
tal  agents  the  new  contact  information 
could  possibly  pertain  to. 
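
One way to picture this temporal filter is as dead reckoning: project each mental agent forward from its last recorded position and keep those whose projected position is near the new contact. The sketch below is an illustration only, assuming a flat two-dimensional geometry, constant heading and speed, and a fixed distance tolerance. The names `Track`, `predicted_position`, `plausible_tracks`, and `tolerance_nm` are hypothetical and do not come from TacAir-Soar.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    name: str
    x: float        # last recorded position, nautical miles east
    y: float        # last recorded position, nautical miles north
    heading: float  # degrees, 0 = north, increasing clockwise
    speed: float    # knots (nautical miles per hour)
    time: float     # simulation time (seconds) of the recorded position


def predicted_position(track, now):
    """Dead-reckon the track forward from its last recorded position."""
    hours = (now - track.time) / 3600.0
    dist = track.speed * hours
    rad = math.radians(track.heading)
    return (track.x + dist * math.sin(rad), track.y + dist * math.cos(rad))


def plausible_tracks(tracks, contact_xy, now, tolerance_nm=5.0):
    """Return the tracks whose projected position is near the new contact.

    tolerance_nm is an assumed threshold, not a value from the paper.
    """
    cx, cy = contact_xy
    result = []
    for t in tracks:
        px, py = predicted_position(t, now)
        if math.hypot(px - cx, py - cy) <= tolerance_nm:
            result.append(t)
    return result
```

An empty result corresponds to the case where no existing mental agent could account for the contact, and a single result to an unambiguous assignment.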

This filter may rule out all existing
mental agents, in which case TacAir-Soar
will create a new one. On the other hand,
the  filter  may  provide  a  unique  mental 
agent  to  assign  the  new  contact  infor¬ 
mation  to.  Otherwise,  there  is  still  some 
ambiguity  so  the  system  must  use  its  final 
filter.  This  filter  compares  individual  fea¬ 
tures  in  the  new  contact  information  to 
the  same  features  in  each  remaining  can¬ 
didate  mental  agent.  The  mental  agents 
that  are  closest  in  value  for  a  chosen  fea¬ 
ture  are  saved,  while  others  are  elimi¬ 
nated  from  consideration.  This  process 
continues  through  a  set  of  features  un¬ 
til  a  unique  mental  agent  remains.  Cur¬ 
rently,  the  features  that  TacAir-Soar  ex¬ 
amines  are  magnetic  bearing,  range,  alti¬ 
tude,  speed,  and  heading.  If,  after  sort¬ 
ing  through  all  of  these  features,  there 
is  still  more  than  one  candidate  mental 
agent,  the  system  simply  chooses  one  at 
random.  However,  this  is  rare  unless  two 
contacts  appear  in  almost  the  same  posi¬ 
tion,  in  which  case  further  discrimination 
is  probably  meaningless  anyway. 
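
The final filter can be sketched as a feature-by-feature elimination. The code below is a hedged illustration rather than the actual TacAir-Soar rules: candidates are plain dictionaries, `slack` is a hypothetical tolerance for treating near-equal differences as ties, and the wrap-around handling for bearing and heading is an assumption the text does not spell out.

```python
import random

# Feature order as listed in the text.
FEATURES = ["bearing", "range", "altitude", "speed", "heading"]
ANGULAR = {"bearing", "heading"}  # measured in degrees, wrap at 360


def difference(feature, a, b):
    """Absolute difference, taking the short way around for angles."""
    d = abs(a - b)
    if feature in ANGULAR:
        d %= 360.0
        d = min(d, 360.0 - d)
    return d


def narrow_by_features(contact, candidates, features=FEATURES,
                       slack=0.0, rng=random):
    """Eliminate candidate mental agents one feature at a time.

    For each feature, only the candidates closest to the contact (within
    `slack` of the best difference) survive. If several candidates remain
    after all features have been examined, one is chosen at random.
    """
    remaining = list(candidates)
    for feature in features:
        if len(remaining) <= 1:
            break
        diffs = [difference(feature, c[feature], contact[feature])
                 for c in remaining]
        best = min(diffs)
        remaining = [c for c, d in zip(remaining, diffs) if d <= best + slack]
    if not remaining:
        return None
    return remaining[0] if len(remaining) == 1 else rng.choice(remaining)
```

Passing a seeded `random.Random` instance as `rng` makes the random tie-break reproducible, which is convenient when testing.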

This  filtering  mechanism  is  based  on 
the  methods  that  real  Navy  pilots  and 
RIOs  use  to  identify  contacts,  and  it  has 
proved relatively robust in allowing the
TacAir-Soar agent to reason about multiple
participants in a simulated engagement.
However, there are times when the
current mechanism fails, indicating that
there is some knowledge missing from
the process. For example, human pilots
generally begin a mission with an idea of
where the friendly and enemy forces are,
and this helps them identify initial contacts.
Additional information sources,
such as IFF, can also be used to help
identify and sort contacts. Within visual
range, pilots can use the actual shapes
of different vehicle types to determine
who is who. So far TacAir-Soar does not
use these additional types of knowledge,
and  so  it  is  prone  to  getting  confused  in 
some  situations  where  humans  do  not 
have  difficulties  maintaining  situational 
awareness.  TacAir-Soar  can  also  become 
confused  when  engagements  become  fast 
and  close,  so  it  does  not  have  time  to  sort 
and  process  all  of  the  incoming  informa¬ 
tion  properly.  However,  this  is  the  type 
of  situation  that  is  difficult  even  for  hu¬ 
man  experts. 

Summary 

Maintaining  situational  awareness  is  a 
particularly  important  part  of  tactical 
behavior, and simulated tactical agents
must  address  the  issues  involved.  We 
have identified two important components
of maintaining situational awareness:
managing knowledge about multiple
tactical  participants  in  an  engagement, 
and  managing  incoming  information  from 
a  variety  of  sources.  In  addition,  we  have 
implemented  knowledge  and  behaviors 
that  address  these  issues  into  the  TacAir- 
Soar  system. 

In order to reason about multiple information
sources, the system has mechanisms
for choosing between existing
sources,  as  well  as  methods  for  generat¬ 
ing  behavior  so  that  the  agent  can  ac¬ 
quire  new  information  (such  as  searching 
for  radar  contacts  or  moving  into  visual 
range).  Reasoning  about  multiple  partici¬ 
pants  requires  the  agent  to  form  a  mental 
picture  of  its  situation,  including  a  men¬ 
tal  representation  of  each  participant  in 
the  engagement.  As  new  information  is 
acquired,  the  system  uses  heuristics  to 
determine  to  which  mental  agent  each 
new contact pertains. In addition, the
agent  performs  these  tasks  within  the  dy¬ 
namic  constraints  of  the  domain,  so  it  is 
possible  for  it  to  get  confused  in  the  same 
types  of  situations  as  humans. 

Our  continuing  work  will  focus  on  the 
addition  of  new  information  sources  such 
as  IFF  and  radar-warning  receivers.  To¬ 
gether  with  these  devices,  the  agent  will 
require  the  knowledge  to  gather  and  man¬ 
age  the  types  of  information  these  devices 
provide  in  the  appropriate  situations.  In 
addition,  we  are  continuing  to  study  how 
human pilots maintain knowledge about
other  participants  in  an  engagement,  so 
that  we  can  improve  the  mechanisms  for 
identifying  and  sorting  contacts  into  men¬ 
tal  agent  representations.  As  this  knowl¬ 
edge  improves,  we  expect  to  develop  gen¬ 
eral  intelligent  methods  for  maintaining 
situational  awareness,  so  the  agent  can 
generate even more realistic and appropriate
behavior.

Abstract

The  domains  that  computer-generated 
forces  address  (such  as  tactical  flight)  are 
more  complex  than  have  generally  been 
used  in  artificial-intelligence  research.  A 
particular  characteristic  of  this  complexity 
is  that  a  reasonable  agent  must  attend  to 
a  large  number  of  goals  at  the  same  time. 
Moreover,  some  of  these  goals  are  inde¬ 
pendent,  while  others  interact  with  each 
other  in  a  variety  of  ways.  This  research 
focuses  on  a  number  of  issues  involved  in 
representing,  reasoning  about,  and  learn¬ 
ing  about  such  complex  goal  structures. 

We  discuss  a  number  of  approaches  that 
we  have  examined  within  the  framework  of 
the  TacAir-Soar  system. 

The Soar-IFOR project aims to build
believable agents for tactical air simulation.
We have constructed a system,
called  TacAir-Soar,  that  embodies  a  large 
amount  of  knowledge  for  carrying  out 
tactical  naval  air  missions  (Jones,  Tambe, 
Laird,  &  Rosenbloom,  1993;  Rosenbloom 
et al., 1994). In the course of our research,
we have developed a large ontology
of the knowledge required to generate
human-like  behavior  in  flight  simulation. 
This  includes  knowledge  about  mission 
goals,  doctrine,  equipment  specifications, 
survival,  situational  awareness  and  inter¬ 
pretation,  cooperation,  and  other  aspects 
of  the  task.  Although  each  of  these  types 
of  knowledge  is  relatively  independent, 
their  impact  on  behavior  is  highly  inter¬ 
dependent. 


This  paper  investigates  various  repre¬ 
sentations  for  sets  of  interacting  goals 
that  arise  from  such  a  complex  knowl¬ 
edge  base.  We  have  identified  five  issues 
that  we  wish  to  address  in  our  examina¬ 
tion  of  the  candidate  approaches.  First, 
it  appears  to  be  necessary  to  represent 
agent  goals  as  a  forest  of  interacting  goal 
hierarchies.  Second,  existing  goal-driven 
systems  are  not  designed  for  such  a  goal 
representation,  so  we  must  find  an  ap¬ 
propriate  mapping  between  agent  goals 
and  the  types  of  goals  that  current  ar¬ 
chitectures  for  intelligence  allow  (e.g., 
we  want  the  architecture  to  do  as  much 
maintenance  of  goals  as  possible).  Third, 
the  agent  must  reason  about  how  well 
different  actions  achieve  combinations 
of  goals.  Fourth,  the  ideal  knowledge 
representation  should  facilitate  effective 
learning  within  the  architecture.  Finally, 
the  representation  should  also  allow  the 
knowledge  base  to  be  updated  by  subject- 
matter  experts  and  knowledge  engineers 
with  a  minimum  of  effort. 

An  example  from  the  tactical  flight 
domain 

To  illustrate  this  complexity  of  knowl¬ 
edge,  consider  a  situation  where  an  F14 
pilot  has  just  launched  a  medium-range, 
radar-guided  missile.  At  this  point,  the 
pilot  has  a  number  of  active  goals,  such 
as  surviving,  accomplishing  a  specified 
mission,  destroying  the  target,  achieving 
another  missile  shot,  maintaining  situ¬ 
ational  awareness,  and  supporting  the 
launched missile. A subset of these goals
appears in Figure 1. Some of these goals
have  a  direct  hierarchical  relationship 
(e.g.,  intercepting  a  target  and  achiev¬ 
ing  proximity  to  it),  while  others  are  rel¬ 
atively  independent  of  each  other  (e.g., 
achieving  proximity  to  a  target  and  em¬ 
ploying  weapons).  In  response  to  this 
host  of  goals,  there  are  a  number  of  can¬ 
didate  actions  the  pilot  could  consider. 
However,  these  goals  constrain  and  some¬ 
times  even  conflict  with  each  other,  so 
it does not always suffice for the pilot
to  select  an  action  that  addresses  only  a 
subset  of  his  or  her  goals. 

In  this  case,  the  pilot  may  wish  to  de¬ 
crease  closing  velocity  to  the  target  in  or¬ 
der  to  increase  chances  of  survival  and  to 
achieve  another  missile  shot.  One  possi¬ 
ble  action  would  be  to  turn  away  from 
the  target,  but  this  would  violate  the 
goals  of  maintaining  a  radar  lock  and 
supporting  the  launched  missile  because 
the pilot’s radar would no longer be illuminating
the target. Another option
would be to reduce speed by reducing
thrust. This has the tradeoff of reducing
the F14’s energy, which could become important
later in the engagement. Other
possible  actions  would  be  to  reduce  speed 
by  gaining  altitude,  or  reduce  closing 
velocity  by  turning  part  way  away  from 
the  target  (as  in  an  “f-pole”  maneuver). 
The  amount  of  altitude  change  or  f-pole 
turn  would  depend  on  other  aspects  of 
the  current  situation,  such  as  the  gimbal 
limits  of  the  radar. 
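
The tradeoff between reducing closure and keeping the target within the radar's gimbal limits can be made concrete with a deliberately simplified calculation. The sketch below assumes flat two-dimensional geometry and considers only the pilot's own contribution to closing velocity (own speed projected onto the line of sight); the target's motion is ignored. The function names and the `margin_deg` safety margin are hypothetical, and the numbers a real aircraft would use are not given in the text.

```python
import math


def own_closure(speed, angle_off_deg):
    """Component of own speed directed along the line of sight to the target.

    angle_off_deg is the angle between own heading and the bearing to the
    target: 0 means flying straight at it, 90 means flying across.
    """
    return speed * math.cos(math.radians(angle_off_deg))


def fpole_turn(desired_angle_deg, gimbal_limit_deg, margin_deg=5.0):
    """Largest turn away from the target that keeps it inside the radar cone.

    The turn is capped at the gimbal limit minus an assumed safety margin,
    so the radar can keep illuminating the target for the missile.
    """
    return max(0.0, min(desired_angle_deg, gimbal_limit_deg - margin_deg))
```

Under these assumptions, a pilot who would like to turn 70 degrees away but has a 60 degree gimbal limit is capped at 55 degrees, which still removes a substantial share of the own-speed contribution to closure.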

Issues  for  constructing  an  intelli¬ 
gent  agent 

Our  approach  to  simulation  is  to  apply 
state-of-the-art artificial-intelligence (AI)
technology to create individual intelligent
participants for simulated engagements.
Unfortunately,  existing  AI  systems  that 
generate  behavior  are  not  well  suited  to 
the  demands  of  knowledge-rich  tasks  with 
interacting  goals.  In  general,  AI  systems 
only  focus  on  one  goal  at  a  time,  or  at 
best  allow  a  single  hierarchy  of  simulta¬ 
neous  goals.  However,  some  of  the  goals 
in  the  current  domain  are  hierarchical  in 
nature,  while  others  clearly  are  not.  Even 
the  non-hierarchical  goals  interact  and 
must  be  taken  into  account  when  generat¬ 
ing  behavior.  In  essence,  it  appears  that 
the  best  representation  of  goal  knowledge 
for  this  domain  consists  of  a  set  of  inter¬ 
acting  goal  hierarchies. 

This  is  not  to  say  there  has  been  no  re¬ 
search  on  planning  to  address  unordered 
interacting goals. For example, Chapman
(1987)  presents  a  complete  and  correct 
planning  method  for  arbitrary  goal  com¬ 
binations,  but  it  works  in  restricted  do¬ 
mains,  and  it  relies  on  search-intensive 
planning,  rather  than  real-time  behavior 
generation.  Cohen,  Greenberg,  Hart,  and 
Howe (1989) and Veloso (1989) have suggested
methods for conjunctive goal planning
in real time by storing preplanned
episodes  or  using  intelligent  heuristic 
search.  These  are  alternatives  to  the  ap¬ 
proach  presented  here,  and  we  plan  to 
examine  the  tradeoffs  between  various 
approaches  in  the  future. 

Rather  than  committing  to  a  single  po¬ 
tential  solution,  we  are  evaluating  a  num¬ 
ber  of  different  approaches  both  to  the 
representation  of  goals  within  an  agent, 
and  its  mechanisms  for  reasoning  about 
interactions  between  goals.  All  of  our  ef¬ 
forts  have  been  developed  with  variations 
of  the  TacAir-Soar  agent  (Jones  et  al., 
1993; Rosenbloom et al., 1994), which is
implemented within the Soar architecture
for cognition (Rosenbloom, Laird, Newell,
& McCarl, 1991).

Figure 1. A subset of concurrent, interacting goals in the tactical air domain.

Mapping  agent  goals  to  architectural 
goals 

As  the  previous  section  illustrated,  agent 
goals  are  best  represented  as  a  set  of  goal 
hierarchies.  However,  traditional  AI  sys¬ 
tems  do  not  encourage  this  type  of  rep¬ 
resentation.  For  example,  architectures 
such as Soar, Prodigy (Minton et al.,
1989), and Theo (Mitchell et al., 1991)
make  goals  a  first-class  object  type,  and 
they  include  specific  mechanisms  for  rep¬ 
resenting,  posting,  and  learning  about 
goals.  But  these  goals  can  only  be  ex¬ 
pressed  easily  in  a  single  stack  or  hierar¬ 
chy.  To  overcome  this  limitation,  alter¬ 
native  goal  representations  can  be  used, 
such  as  encoding  goals  as  part  of  the 
agent’s  current  state  description.  How¬ 
ever,  this  type  of  representation  precludes 
using  many  of  the  mechanisms  the  archi¬ 
tecture  provides  directly  to  support  goals. 
Thus,  we  are  left  with  the  question  of 
how  the  agent  goals  we  wish  to  represent 
can  or  should  be  mapped  to  architectural 
goals  that  the  overall  system  supports. 

The  initial  design  of  TacAir-Soar  takes  a 
mixed  approach  to  the  mapping  between 
agent  and  architectural  goals.  Some  of 
the  agent  goals  are  represented  explic¬ 
itly  as  architectural  goals,  whereas  others 
appear  as  implicit  goals  in  the  agent’s  sit¬ 
uation  representation.  The  explicit  goals 
map  directly  onto  Soar’s  goal  stack,  and 
they  benefit  from  Soar’s  goal  maintenance 
and  learning  mechanisms.  In  contrast, 
implicit  goals  are  recorded  along  with 
other  descriptions  of  the  agent’s  current 
state, such as its vehicle type, current
speed, missile status, etc. Some implicit
goals  (e.g.,  survival)  are  not  represented 
at  all,  and  are  simply  assumed  to  exist, 
and  the  behavior-generation  rules  take 
them  into  account  even  though  they  are 
not  represented  explicitly.  In  the  above 
example,  survival,  maintaining  situational 
awareness, and decreasing closure are implicit
goals, whereas destroying the target
and  supporting  the  missile  are  mapped  to 
architectural  goals.  In  many  ways,  this  is 
an  ad  hoc  solution,  but  it  allows  the  sys¬ 
tem  to  generate  reasonable  behavior  by 
using  architectural  mechanisms  to  sup¬ 
port  the  explicit  goals  while  still  allow¬ 
ing  the  implicit  goals  to  modify  behavior 
when  appropriate.  Thus,  this  represen¬ 
tation  works  well  for  generating  behav¬ 
ior,  but  difficulties  arise  when  the  sys¬ 
tem  must  learn  to  adapt  that  behavior. 
For  example,  there  is  no  easy  way  for  the 
system  to  detect  that  maintaining  situa¬ 
tional  awareness  sometimes  conflicts  with 
evading  a  missile. 

In  response  to  this  problem,  we  have 
investigated  two  alternative  approaches 
to  mapping  goals.  In  one  approach,  all 
agent  goals  are  mapped  into  the  archi¬ 
tectural  goal  hierarchy,  collapsing  the 
agent’s  forest  of  goals  into  a  single  stack. 
The  alternative  approach  is  to  map  none 
of  the  agent  goals  into  the  architectural 
goal hierarchy. In this case, all reasoning
takes place in the service of a single
architectural  goal,  and  all  other  goals  ap¬ 
pear  as  descriptions  of  the  agent’s  cur¬ 
rent  situation.  There  are  a  number  of 
tradeoffs  between  these  two  approaches 
involving  the  automatic  mechanisms  for 
maintaining  a  goal  stack  and  learning, 
and  the  flexibility  of  the  representation  of 
goals  and  knowledge  about  goals  in  terms 
of expressive power and ease of maintenance.
For example, when mapping all
agent  goals  to  architectural  goals,  the 
current  forest  of  goals  must  be  collapsed 
into  a  single  hierarchy.  This  new  hier¬ 
archy  dynamically  imposes  a  syntactic 
parent-child  relationship  on  some  goals 
even  when  such  a  relationship  does  not 
exist  semantically.  For  example,  evad¬ 
ing  a  missile  might  be  assigned  as  a  child 
of  employing  weapons,  even  though  the 
goals  do  not  really  depend  on  each  other. 

This  resulting  hierarchy  represents  a 
single  total  ordering  on  the  normally 
partially  ordered  goals,  which  can  lead 
to  difficulties  in  maintaining  the  goal 
stack.  In  the  above  example,  if  the  goal 
to  employ  weapons  goes  away,  the  goal 
to  evade  a  missile  will  also  be  popped 
from  the  stack,  because  it  was  arbitrar¬ 
ily  set  up  as  a  child  of  the  goal  to  employ 
weapons.  On  the  other  hand,  because 
all  of  the  goals  are  mapped  to  architec¬ 
tural  goals,  this  version  of  the  system  can 
take better advantage of built-in mechanisms
for detecting and implementing
learning  opportunities.  The  architectures 
for  intelligence  that  we  have  mentioned 
generally  learn  about  relationships  across 
architectural  goals,  but  not  within  archi¬ 
tectural  goals.  If  all  the  reasoning  takes 
place  within  a  single  architectural  goal, 
no  learning  can  take  place. 
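
The stack-maintenance problem described above is easy to reproduce in a toy model. The sketch below is an illustration only and does not model Soar's actual goal stack. `GoalStack` and `GoalForest` are hypothetical classes, and the goal names are taken from the example in the text. It contrasts a single stack, where removing a goal pops everything above it, with a forest in which parent-child links are recorded only where they really exist.

```python
class GoalStack:
    """A single goal stack: each goal is treated as a child of the one below."""

    def __init__(self):
        self.goals = []

    def push(self, goal):
        self.goals.append(goal)

    def remove(self, goal):
        """Removing a goal also pops every goal stacked above it."""
        index = self.goals.index(goal)
        popped = self.goals[index:]
        self.goals = self.goals[:index]
        return popped


class GoalForest:
    """Several independent hierarchies with explicit parent-child links."""

    def __init__(self):
        self.children = {}  # goal -> list of its subgoals

    def add(self, goal, parent=None):
        self.children.setdefault(goal, [])
        if parent is not None:
            self.children[parent].append(goal)

    def remove(self, goal):
        """Remove a goal together with its genuine descendants only."""
        removed = [goal]
        for child in self.children.pop(goal, []):
            removed.extend(self.remove(child))
        for subgoals in self.children.values():
            if goal in subgoals:
                subgoals.remove(goal)
        return removed
```

In the stack version, dropping the goal to employ weapons also drops the goal to evade a missile stacked above it; in the forest version the evasion goal survives because it was never a real child of the weapons goal.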

Our experiences with various representations
for goals have also led us to consider
alternatives for expanding architectures
such as Soar, so that they can explicitly
represent sets of goal hierarchies
rather  than  just  a  single  hierarchy.  If  this 
effort  is  successful,  it  should  provide  us 
with  all  of  the  advantages  of  both  of  the 
extreme approaches mentioned above, because
all agent goals would map directly
to architectural goals in a simple manner.

Reasoning  about  interactions 

In addition to an appropriate representation
for goals, the agent must contain
mechanisms for reasoning about the way
goals influence each other. There are two
general cases that we consider here. Two
goals interact when they can be achieved
or maintained concurrently, but they each
constrain the behaviors that are appropriate.
For example, in Figure 1, the agent
can reduce closing velocity to its target
while maintaining a radar lock by turning
just until the target is on the edge of
the radar. Different behaviors would be
appropriate if these goals were being addressed
independently. In contrast, some
goals are simply impossible to achieve or
maintain at the same time. In this case,
we say the goals conflict with each other.
Again referring to Figure 1, the agent
cannot always maintain a radar lock if
it is busy evading a missile. Thus one or
the other goal must be suspended temporarily
or ignored completely.

Each of the system variations we have explored addresses
goal interactions and conflicts. In one of our approaches,
interactions between goals are represented implicitly
within the proposal conditions for actions. For example,
an agent might propose the action of maintaining radar
lock on a target unless there is an incoming threat that
needs to be evaded. An alternative approach involves
explicitly representing the interaction between goals, so
the agent can reason about when to suspend goals or attend
to multiple goals. In this case, the agent proposes the
goals of maintaining radar lock and evading a missile
independently, and separate reasoning determines which set
of actions addresses these goals in the best way. For
example, the agent may decide to evade because survival is
a high-priority goal.
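
The two approaches just described can be contrasted in a small sketch. This
is an illustration in Python, not TacAir-Soar code: the goal names come from
the text, but the `Situation` fields, the `PRIORITY` values, and the
`CONFLICTS` table are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    target_locked: bool
    incoming_missile: bool


# Approach 1 (implicit): the interaction is folded into the
# proposal condition of the action itself.
def propose_actions_implicit(s):
    actions = []
    if s.incoming_missile:
        actions.append("evade-missile")
    # Maintain the lock unless there is an incoming threat.
    if s.target_locked and not s.incoming_missile:
        actions.append("maintain-radar-lock")
    return actions


# Approach 2 (explicit): goals are proposed independently, and
# separate arbitration knowledge decides which ones stay active.
PRIORITY = {"evade-missile": 2, "maintain-radar-lock": 1}  # hypothetical values
CONFLICTS = {frozenset({"evade-missile", "maintain-radar-lock"})}  # hypothetical table


def propose_goals(s):
    goals = []
    if s.incoming_missile:
        goals.append("evade-missile")
    if s.target_locked:
        goals.append("maintain-radar-lock")
    return goals


def arbitrate(goals):
    """Keep goals in priority order, suspending any that conflict
    with a higher-priority goal that is already active."""
    active = []
    for g in sorted(goals, key=lambda name: -PRIORITY[name]):
        if all(frozenset({g, a}) not in CONFLICTS for a in active):
            active.append(g)
    return active
```

In the implicit version, adding a new goal may require revisiting the
conditions of existing actions. In the explicit version, only the
arbitration tables change.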

The  agent  currently  makes  these  deci¬ 
sions  with  built-in  arbitration  knowl¬ 
edge  about  which  goals  interact  with 
each  other.  However,  a  final  important 
issue concerns how the agent would learn
such  knowledge  with  experience.  We  see 
a  number  of  advantages  from  the  ability 
to  identify  and  learn  about  goal  interac¬ 
tions  and  conflicts.  If  the  agent  finds  it¬ 
self  in  an  unexpected  situation,  it  should 
have  the  flexibility  to  generate  reasonable 
behavior  by  evaluating  the  effects  of  dif¬ 
ferent  actions  in  light  of  the  current  set 
of  goals.  In  addition,  the  knowledge  ac¬ 
quisition  task  can  be  made  easier  if  an 
agent  programmer  does  not  have  to  an¬ 
ticipate  all  the  interactions  and  conflicts 
that  may  arise  when  new  goals  are  added. 
Finally,  if  the  system  can  detect  interac¬ 
tions  that  human  experts  have  not  en¬ 
countered  (e.g.,  when  testing  new  types 
of  technology),  the  system  may  be  able  to 
discover  new  tactics  for  satisfying  partic¬ 
ular  sets  of  goals. 

Currently, none of our agent implementations detect or
learn about goal interactions on their own. However, in
developing alternative frameworks and goal
representations, we have identified some approaches that
may be useful in supplementing TacAir-Soar with this
ability. As an example, suppose the system knows about the
goals to evade threats and to maintain radar lock, but it
has no knowledge about how these goals conflict. The
system’s first task is to detect the conflict. This occurs
when it proposes actions to come to two different
headings. This will cause an impasse in the Soar
architecture, which identifies an opportunity to learn.
Next, the system can plan by predicting outcomes of
various actions, thus deciding which goals are more
important to achieve, and which can be suspended
temporarily. For example, the system may discover by
mental simulation that it will be destroyed if it does not
evade an incoming threat. Thus, the goal to evade should
take precedence. In other situations, it may be more
important to maintain the radar lock (e.g., the agent may
have launched its own missile). Goal interactions will be
handled in a manner similar to goal conflicts, except the
system will have to be supplemented with extra evaluation
knowledge so that it can appropriately measure the partial
satisfaction of multiple goals.
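
The detect, simulate, and remember sequence described above might be
sketched as follows. This is a Python illustration under stated
assumptions, not Soar code. The outcome model in `simulate`, the `world`
dictionary keys, and the `learned_preferences` cache are all hypothetical.
The cache stands in only loosely for the way Soar can store the result of
resolving an impasse.

```python
# Cache of previously resolved conflicts (hypothetical stand-in for learning).
learned_preferences = {}


def detect_conflict(proposals):
    """proposals maps goal name -> desired heading in degrees.
    A conflict exists when the goals ask for different headings."""
    return len(set(proposals.values())) > 1


def simulate(goal, world):
    """Toy outcome prediction: survival matters first, then
    continued guidance of the agent's own missile."""
    survives = (not world["incoming_missile"]) or goal == "evade-threat"
    guides_own_missile = (
        world["own_missile_in_flight"] and goal == "maintain-radar-lock"
    )
    return (survives, guides_own_missile)


def resolve(proposals, world):
    """Return the goals to pursue now; the others are suspended."""
    if not detect_conflict(proposals):
        return sorted(proposals)
    key = (frozenset(proposals.items()), frozenset(world.items()))
    if key not in learned_preferences:
        # Plan by predicting the outcome of committing to each goal.
        learned_preferences[key] = max(
            proposals, key=lambda g: simulate(g, world)
        )
    return [learned_preferences[key]]
```

Once a situation has been resolved, later encounters with the same
situation reuse the stored preference instead of repeating the simulation.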

Summary 

There  are  a  number  of  important  issues 
involved  in  handling  interacting  and  con¬ 
flicting  goals  to  generate  reasonable  be¬ 
havior  in  a  complex  domain.  Perhaps 
foremost  are  the  facts  that  an  intelli¬ 
gent  system  must  be  able  to  represent 
and  reason  about  multiple  concurrent 
goal hierarchies, and traditional goal representations
in existing AI systems are
inadequate. Given an appropriate goal
representation,  an  agent  must  also  be 
able  to  reason  effectively  about  the  pos¬ 
sible  interactions  and  conflicts  between 
goals,  producing  the  best  behavior  given 
all  the  various  constraints.  Finally,  intel¬ 
ligent  agents  must  eventually  be  able  to 
acquire  knowledge  about  interactions  and 
conflicts  automatically,  so  that  the  agent 
can behave flexibly and knowledge from
subject-matter experts can be encoded
without  getting  lost  in  tiny  details. 

We  have  experimented  with  a  variety  of 
representations  for  concurrent  goal  hier¬ 
archies,  and  attempted  to  fit  these  rep¬ 
resentations  nicely  into  an  existing  AI 
architecture.  We  have  been  successful  in 
this  effort,  but  we  have  also  discovered 
possible  opportunities  for  improving  the 
architecture  itself.  Although  we  have  not 
yet  solved  all  the  problems  with  detecting 
and  learning  about  goal  interactions,  our 
efforts  so  far  have  helped  us  identify  how 
and  where  learning  might  occur  within 
the  existing  TacAir-Soar  system.  Our 
current  efforts  involve  refining  our  evalu¬ 
ation  of  the  best  representation  for  goals 
and  implementing  our  ideas  for  learning. 

In  addition,  other  researchers  have  in¬ 
vestigated  the  issue  of  real-time  planning 
for  interacting  goals  (Cohen  et  al.,  1989; 
Veloso,  1989).  Although  this  work  makes 
slightly  different  assumptions  about  the 
demands  of  the  domain  and  real-time  be¬ 
havior  generation,  we  hope  to  evaluate 
some  of  their  ideas  in  the  context  of  the 
TacAir-Soar system.

Abstract

A fundamental goal of the IFOR/WISSARD project is the
creation of autonomous, intelligent agents that can
participate in computer simulations of battle for training
and gaming purposes. The creation of such an agent has many
of the same requirements as constructing an expert system.
In particular, the designers face the enormous task of
acquiring, encoding, and refining the knowledge that
defines the agent’s desired behavior. The knowledge must be
drawn from many sources, e.g. subject matter experts
(SMEs), training manuals and other texts, observation,
experimentation, etc. This is usually done by many people
whose raw materials must then be represented as a coherent
specification that designers can use for constructing the
agent, communicating among themselves, and, ultimately,
describing the causes and rationales for the agent’s
behavior to others. In any case, this is not a simple task,
but in the case of the TacAir-Soar project, the difficulty
is increased by the geographical distribution of the
project’s members.¹ Our solution to this problem is an
electronic, multi-layer hypertext document called the
TacAir-Soar Description Document (TDD) which is implemented
within the National Center for Supercomputing Applications’
(NCSA’s) Mosaic. This document allows its viewers to obtain
information about the domain in plain English, about the
agent in terms of its structures and behaviors, and about
the actual code that implements the agent.

Introduction 

The TacAir-Soar project requires information from many
sources. The sources used to date include interviews with
SMEs, electronic mail messages, training manuals, telephone
conversations, and observation of fighter pilots during
simulated engagements. To be useful, all of this information


¹For a description of the TacAir-Soar project, see
this volume, [Rosenbloom94].


must  be  combined  into  a  single,  organized  repos¬ 
itory  that  can  be  accessed  by  all  members  of  the 
project at each of the three sites involved (University
of Michigan, University of Southern California/Information
Sciences Institute, and Carnegie Mellon University).
Additionally, there needs to be some way to show the
influences of various pieces of knowledge on the design
and development of the agents. Without this, there is no way
for  SMEs  to  validate  the  relationship  between  the 
domain  knowledge  they  provide  and  the  agent’s 
behavior. 

Given  these  requirements,  the  NCSA’s  Mo¬ 
saic  system  was  chosen  as  the  application  within 
which  to  create  a  document  that  would  combine 
domain  knowledge  with  agent  implementation  in 
a  coherent  manner  that  could  be  accessed  across 
the  Internet.  The  TacAir-Soar  Description  Docu¬ 
ment  has  three  layers,  corresponding  to  the  three 
levels  of  specification  we  use  to  discuss  agent  be¬ 
havior.  The  top  layer  of  the  document  reflects 
the knowledge level, an English description of the
air-to-air  combat  domain.  This  is  a  level  of  spec¬ 
ification  that  is  concerned  with  the  knowledge  of 
objects,  actions,  and  relations  in  the  domain  in¬ 
dependent  of  any  particular  computational  imple¬ 
mentation  of  that  knowledge.  Although  the  top 
layer  of  the  TDD  presents  a  coherent  view  of  the 
domain  by  integrating  across  particular  instances 
of  knowledge  acquisition,  information  gathered 
from  any  of  the  sources  listed  above  (interviews, 
electronic mail, etc.) is also linked² to the top
layer  in  its  raw  form  to  allow  for  traceability. 

Items  in  the  top  layer  of  the  TDD  may  also 
have  links  into  the  next  layer  of  the  document 
which  describes  the  agent’s  structures  and  behav¬ 
iors  at  the  level  of  the  problem  space  computa¬ 
tional  model  (PSCM).  The  PSCM-level  is  a  de¬ 
scription  of  the  agent’s  behavior  in  terms  of  an 


²In a hypertext document, links are portions of
the text that are highlighted in some way and are
associated with another document. If a link is selected,
the document to which it refers is displayed.

abstract model of the Soar architecture, independent of the
particular implementation of the architecture in C. Each
PSCM-level document links to the symbol level
representation of the agent, i.e. to a matching Soar code
file in the third layer of the TDD. The code files in the
third layer are the actual files that are loaded when an
agent is created, so they are always current. Because the
layers are linked, a user can work up or down through the
hierarchy. Working downward means beginning with the
description of a particular concept and following it
through the layers to its implementation. Working upward
means moving from the agent code through the layers to find
the justification for a particular structure or behavior.
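
The linked three-layer structure can be pictured as a small graph of
documents with links in both directions. The sketch below is a Python
illustration only, since the TDD itself consists of hypertext documents
viewed in Mosaic. The `Doc` class, `link`, and `trace` are hypothetical
names. The three document names are borrowed from the example used later in
this paper.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    name: str
    layer: str  # "knowledge", "pscm", or "symbol"
    down: list = field(default_factory=list, repr=False, compare=False)
    up: list = field(default_factory=list, repr=False, compare=False)


def link(upper, lower):
    """Record a link from a higher layer to a lower one, plus the
    back-link that makes upward tracing possible."""
    upper.down.append(lower)
    lower.up.append(upper)


def trace(doc, direction):
    """Follow links depth-first; direction is "down" (toward the
    implementation) or "up" (toward the justification)."""
    seen, stack, names = set(), [doc], []
    while stack:
        current = stack.pop()
        if id(current) in seen:
            continue
        seen.add(id(current))
        names.append(current.name)
        stack.extend(reversed(getattr(current, direction)))
    return names


aspect = Doc("Target Aspect", "knowledge")
operator = Doc("Cut-to-ta", "pscm")
code = Doc("cut-to-ta.soar", "symbol")
link(aspect, operator)
link(operator, code)
```

Calling `trace(aspect, "down")` walks from the concept to its code, and
`trace(code, "up")` walks from the code back to the concept that justifies
it.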

Platform and Justification

Mosaic is a hypermedia browser distributed by the NCSA. It
allows a user to view documents that contain plain text,
formatted text, PostScript, images and diagrams, audio, and
digitized video. When combined with servers that use the
HyperText Transport Protocol (HTTP) [Berners-Lee92], Mosaic
can be used to view documents that are located throughout
the world on machines that are connected to the Internet.

These  features  mean  that  Mosaic  has  many  ad¬ 
vantages.  First,  because  it  is  already  written  and 
has  a  large  number  of  users  world-wide,  we  do 
not  need  to  spend  time  developing  or  maintain¬ 
ing  our  own  tool.  The  large  base  of  users  also 
means  that  tools  that  support  the  authoring  of 
documents,  such  as  editors  and  translators,  are 
readily  available.  Second,  Mosaic  is  in  the  public 
domain  so  there  is  no  monetary  cost  associated 
with the TDD’s development or use. Third, Mosaic runs on
many Unix workstations and on the Macintosh, so all group
participants are able to access the documentation
regardless of the machine they normally use. Fourth,
through the use of a server, all members can access the
same copies of the document at all times.³ Changes are
immediately accessible to all, reducing the chance of
out-of-date documentation causing confusion. Finally,
because of the multimedia capabilities, we can include such
items as diagrams for describing tactics and maneuvers,
images of equipment,
and  unmodified  electronic  mail  messages.  This 
flexibility  allows  us  to  use  the  most  appropriate 
means  to  store  and  convey  information. 

The  Knowledge  Level 

The top layer of the TDD describes the domain of air-to-air
combat at the knowledge level, i.e. independent of any
particular computational implementation of the domain. The
information contained in this layer is of a general nature
and is obtained from a number of sources. All source
material is kept and is referenced by the domain
description to allow for traceability. Interviews with
pilots are videotaped whenever possible. (These are turned
into documents that are considered the source of this
information. The videotapes themselves are also kept.)
Telephone interviews are also turned into source documents.
Electronic mail messages are given a standard identifying
header. For information from manuals and books, a
bibliography-style entry is kept.

³Anyone with access to Mosaic can view our documentation
with the URL http://krusty.eecs.umich.edu/ifor.

Knowledge  from  these  disparate  sources  is  or¬ 
ganized  into  a  coherent  whole  by  the  use  of  a 
topical  tree-like  structure  and  an  index  by  topic 
area. Reorganizing the information in this way has a number
of benefits. Combining information from multiple sources on
the same topic quickly reveals contradictions and missing
or unclear information. Further, as detail is added to
specific topics, the new information is near the older,
more general knowledge and so is easy to locate. Since all
of  the  organized  documents  are  given  similar  for¬ 
mats,  they  are  easier  to  browse  than  the  various 
source  documents.  Finally,  browsing  the  informa¬ 
tion  topically  is  generally  the  easiest  method  for 
users. 

The  PSCM  Level 

The  layer  below  the  knowledge  level  gives  a  de¬ 
scription  of  the  agent’s  behavior  in  terms  of  an 
abstract  model  of  the  Soar  architecture,  indepen¬ 
dent  of  the  particular  implementation  of  the  ar¬ 
chitecture  in  C.  The  PSCM  is  the  basis  of  Soar 
and is the common view shared by the project participants.
It is a view of problem solving behavior in which the agent
pursues its goals by applying operators to the current
state thereby deriving a new state in an iterative process
until the goal state is achieved. Thus, this level of
description maps domain knowledge into the form of goals,
operators, and state information.⁴ By following links
between the knowledge-level documents and those at the
PSCM level, the effect of
the  knowledge  on  the  structure  of  the  agent  can 
be  determined.  In  addition,  areas  of  the  agent 
that  would  be  affected  by  additional  information 
in  a  particular  domain  area  can  be  found.  Since 
the  PSCM  is  a  specific  instance  of  many  ideas 
that  are  common  within  AI,  this  level  of  represen¬ 
tation  may  be  accessible  to  many  subject  matter 
experts  as  well. 
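
The iterative view of problem solving described above, in which operators
are applied to the current state until the goal state is reached, can be
sketched in a few lines. This is a generic Python illustration, not the
Soar architecture. The `Operator` class, the `solve` loop, and the
turn-left/turn-right operators are assumptions invented for the example,
and picking the first candidate is only a placeholder for a real selection
step.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Operator:
    name: str
    proposes: Callable[[Any], bool]  # is the operator applicable here?
    apply: Callable[[Any], Any]      # derive the new state


def solve(state, goal_test, operators, max_steps=100):
    """Apply operators to the current state until the goal is met.
    Returns the final state and the names of the operators applied."""
    applied = []
    for _ in range(max_steps):
        if goal_test(state):
            return state, applied
        candidates = [op for op in operators if op.proposes(state)]
        if not candidates:
            raise RuntimeError("no operator applies to state %r" % (state,))
        chosen = candidates[0]  # placeholder for a real selection step
        state = chosen.apply(state)
        applied.append(chosen.name)
    raise RuntimeError("step limit reached")


# Toy problem: the state is a heading error in degrees, and the
# goal state is an error of zero.
OPERATORS = [
    Operator("turn-left", lambda err: err > 0, lambda err: err - 10),
    Operator("turn-right", lambda err: err < 0, lambda err: err + 10),
]
final_state, applied = solve(30, lambda err: err == 0, OPERATORS)
```

Here the state is a single number. In an agent it would hold whatever
state information the PSCM-level documents describe.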

⁴An understanding of Soar and the PSCM can be
gained from [Newell90] and [Rosenbloom93].

The Symbol Level

There is a one-to-one mapping between the documents at the
PSCM level and the Soar agent code, the lowest layer of the
TDD. These files can be viewed to see how the definition of
a problem space or operator was realized in code that
executes within the implementation of the architecture.
Since the PSCM is an abstraction, design choices at that
level may have many realizations in the code. By separating
the code level into its own layer, we can separate the
general constraints on the realization (e.g. that it should
be an operator rather than state) from the realization
itself. By linking the two layers we maintain a record of
the origin of our coding choices.

Example  of  Use 

Figure  1  shows  the  root  of  documents  which  make 
up  the  knowledge  level.  Underlined  words  and 
phrases  indicate  links  to  other  documents.  The 
links Geometry through Communication connect to parts of
the knowledge-level document hierarchy that cover those
topics. TacAir-Soar Goal/Operator Hierarchy links to the
root document of the PSCM level. Tom Brandt at UM, July 23,
1993 (2vN) links to a source document that was created from
a videotaped interview.

Following  a  piece  of  the  topical  organization 
downward,  figure  2  shows  the  document  that  is 
linked  to  by  the  Geometry  link  of  figure  1.  This 
document,  still  part  of  the  knowledge  level,  has  a 
figure  that  shows  the  terms  used  to  describe  the 
geometry of two aircraft. The link Target Aspect
then leads to the document in figure 3, which
gives a narrative description of target aspect.
This  description  ends  with  a  link  that  goes  to 
a  knowledge  level  document  that  covers  the  re¬ 
lated  topic  of  lateral  separation.  Below  the  de¬ 
scription  of  target  aspect  are  links  to  other  rel¬ 
evant  knowledge-level  concepts,  as  well  as  a  link 
to  the  source  for  this  document’s  information. 

At  the  bottom  of  figure  3  there  are  links  to 
the  three  documents  in  the  PSCM  level  of  the 
TDD  that  involve  target  aspect.  Following  one 
of these, Cut-to-ta, leads to the document in figure 4,
which describes the operator that is used to cause the
agent to turn its aircraft in order to achieve a desired
target aspect. Finally, following the link
/top-ps/. . . /cut-to-ta.soar into the symbol level of the
TDD displays the Soar productions that implement the
PSCM-level operator,
as  shown  in  figure  5. 

Conclusions 

Initial knowledge acquisition results in changes to the
domain level of the document. This information is in a form
that both project members and non-project members can
access and evaluate. Designers can then decide how to
realize the new information within the agent. This
discussion tends to take place at the abstract level of the
PSCM, and is subsequently recorded as additions or changes
to the PSCM layer. Once the PSCM design is complete, coding
can begin. If there are decisions that cannot be made
unambiguously at the symbol level, pointers back through
the PSCM can help direct attention to other parts of the
code that may be relevant.

Figure 1: Root of the Knowledge Documentation.

Figure 2: The Geometry Knowledge Document.

Figure 3: The Target Aspect Knowledge Document.

Figure 4: The Cut-to-ta PSCM document.

The  system  we  have  described  has  proven  to 
work  well.  Both  project  members  and  domain 
experts  are  able  to  access  the  information.  The 
ability  to  include  images  and  diagrams  has  proven 
to  be  very  useful. 

There  are  a  number  of  limitations  which  need 
to be explored in more detail. The question
of centralized vs. de-centralized control of the
document has become increasingly important, as
is usual for any dynamically changing resource.
Centralized  control  has  thus  far  ensured  that  all 
have  access  to  consistent  information.  However, 
this  has  proven  to  be  a  bottleneck  in  the  pro¬ 
cess  of  adding  new  information.  Also  at  issue  are 
the  possible  roles  of  digital  libraries  in  expanding 
the  type  of  information  that  can  be  included  in 
the  “document”  (e.g.  video  of  experiments  rather 
than  just  transcriptions)  and  of  the  Internet  in 
making such a document more widely available.

Abstract

The fielding of large numbers of autonomous
computer-generated forces requires that these forces be
able to coordinate their behaviors. Within the military,
there are many levels of coordination, from the high-level
management of a theater of war, down to the low-level
interactions of individual soldiers. TacAir-Soar represents
a data point at this low level, where individual fighter
planes must fly together in sections with support from an
air intercept controller. In this paper we analyze the
types of coordinated behavior required to make TacAir-Soar
a realistic model of human behavior, the methods that our
agents employ to coordinate their behavior, and finally,
the constraints coordination places on the design of
computer-generated forces.

Introduction 

One of the ultimate goals of research in computer-generated
forces is to populate simulated battlefields with automated
intelligent agents¹ which behave as humans would on a real
battlefield. Although we can make progress by creating more
and more individual agents, we will still be far short of
modeling human behavior unless we create agents that
coordinate their behavior.

The reasons for coordinating the behavior of individuals
are obvious. A single unit has only limited ability to
sense its environment directly, and limited ways in which
it can act on its environment. Through coordination of
sensing, multiple agents can share their knowledge about
the environment, thus making whatever action they take more
effective. Through coordination of their actions, multiple
agents can perform actions that no single agent can
perform, such as creating diversions and supporting
actions, or bringing to bear fire power that no single
agent has alone. The problem is how to get many different
agents, in different physical locations, with different
views of the environment, with different physical
abilities, and possibly different short-term goals, to work
together to achieve the most effective results.

¹In this paper we will use the term agent to refer to a
computer-generated entity, such as a pilot of a fighter
plane.

In the past, computer-generated forces have taken one of
three approaches:

1. No coordination

Many computer generated forces do not attempt to coordinate
their behavior with any other forces. They have a specific
mission that they are to execute, and they execute the
mission independent of other friendly forces. In many
semi-automated forces (SAFORs), it is left for an
overseeing human to organize their behavior. Sometimes this
requires that the human "micro-manage" the individual
units, and in the heat of battle, the human can become
overloaded.

2. Centralized control

When tight coordination of behavior of a small unit is
required, the common approach is to treat the aggregation
as a single unit in terms of behavior. For example,
individual tanks may be represented on the battlefield, but
their behavior is organized into platoons and companies.
Instead of attempting to represent the communication and
coordination of the individual tanks, behavior is generated
for the platoon (or company) as a whole and then
specialized for the individual unit (a tank). Each unit
does not independently reason about its behavior and there
is no explicit communication between units.

3. Explicit command and control

In a limited number of cases, computer-generated forces
have generated explicit orders to lower-echelon forces, as
in Eagle II [Powell and Hutchinson, 1993]. However, this
did not include real-time interaction between independent

The conclusion is that only limited progress has been made
in creating agents that coordinate their behavior in
flexible ways. Unfortunately, solving the general problem
of dynamically organizing multiple agents to maximize their
coordination is an intractable problem. However, to create
coordinated automated forces does not require a complete
solution to this problem. We can limit ourselves to
modeling the methods and practices currently used by
military organizations. Within the military, the command
structure is a relatively static hierarchy, where
preplanning and training are used extensively to avoid the
complexities, delays, communication difficulties, and
possible confusion that can arise with dynamic
reorganization or retasking of the participants. Also, much
of the behavior is determined by predefined tactics and
doctrine, which reduces the need for communication. This is
not to say that such reorganizations and retaskings are not
possible; it is just that they are held to a minimum, and
are based on well defined procedures.

Thus, our goal is not to develop new forms of coordination
and communication for the military (they have been working
on this for thousands of years), but instead to create
computer-generated forces that can participate in
coordinated behavior within the limits (and breadth) of a
military organization. Our goal is to identify the types of
coordination and communication that must be supported by an
intelligent agent and then examine how this impacts on the
design of computer generated forces.

Our approach starts with individual units that
independently reason about their own behavior and
coordinate their behaviors using explicit communication as
well as shared tactics and doctrine. We plan on using
explicit command and control, but with the intent that it
is ubiquitous and used more flexibly and robustly than has
been demonstrated to date. Some of the advantages of that
approach are as follows:

1. Coordinated behavior will be more realistic.
Coordination based on communication will be explicit,
require time to transmit and interpret, be open to
mis-interpretation, jamming, etc. Coordination based on
shared doctrine and tactics will obey doctrine, but it will
also fail when the doctrine fails. In addition, by
independently modeling each entity (instead of a group as a
whole), it should make it easier to model doctrine where
the individual unit or subgroup is expected to have
initiative.

2. Coordinated behavior should scale up to higher levels of
command. Instead of trying to create larger and larger
aggregate forces that are centrally controlled, commanding
agents are created (such as platoon, company, battalion
commanders) whose purpose is to generate commands for lower
levels and report back to higher levels.

3. Coordinated behavior should be easier for humans to
understand because there will be explicit communication
that can be observed.

4. Coordinated behavior between humans and computer
generated forces will be possible.

In this paper, we report on the first steps at coordinated
behavior within automated forces by examining our
implementation of the coordination required for two planes
flying tactical air missions as a section. Of necessity, we
have been studying low-level real-time coordination that
arises during the execution of a specific mission. We have
not studied the longer term coordination that is required
at higher levels of the command hierarchy such as managing
an air or ground campaign.

Our approach is to treat our work as a case study. We start
by analyzing the coordination required for flying two
planes in a section in our current implementation. Next we
study the various methods that computer generated forces
can use to obtain the knowledge required to coordinate
their behavior. This leads to the main point of the paper
which is to identify how coordination impacts the design of
computer generated forces.

Example  Scenario 

The environment in which we are studying coordination is
tactical air combat, as part of the Soar/IFOR component of
WISSARD. The agents we are modeling include fighter planes,
such as F-14’s and MiG-29’s, and air intercept controllers
(AIC) in AWACS-like planes such as the E-2C. Our IFOR
agents are built in TacAir-Soar [Rosenbloom et al., 1994]
within the Soar architecture [Laird et al., 1987] and
interact with the DIS world through ModSAF [Calder et al.,
1993]. Each agent is independently situated in its own
vehicle (such as an F-14, a MiG-29, or an E-2C), and is
restricted to perceiving what is available on its own
vehicle’s sensors. Our agents communicate via radio
messages that approximate the messages sent by human
pilots. In our current implementation, our agents perform
2v2 intercepts (as either red or blue, or both).

Consider the scenario in Figure 1 in which two
blue fighter planes (F-14's) are flying together as
a section in a combat-air patrol (CAP) protecting
an aircraft carrier, with help from an air
intercept controller on an E-2. The distances and
sizes of planes are not to scale. Two red enemy
planes (MiG-29's) are coming in from the
east to attack the aircraft carrier, posing a threat
that the blue fighters must respond to. In the
remainder of this section, we present the types
of coordination implemented within TacAir-Soar
using examples from this scenario as the blue and
red fighters engage.

Flying as a Section

Historically, a section of two planes has been
found to be the minimal effective fighting unit.
A section consists of a lead and a wingman flying
together on a joint mission. The tactical lead
of the section directs the maneuvering of the
section, either through his actions or through explicit
communication to the wingman. The goal of the
wingman is to stay in formation and support the
activities of the lead (such as through manipulation
of its radar). In some circumstances, the
wingman will take over as lead (such as if the
lead's equipment malfunctions, or the lead is out
of missiles).

To be an effective section, the lead and wingman
must coordinate their maneuvering, their
sensing of the environment, their employment of
weapons, and the organization of their section.
Below is a detailed list of the behaviors that
have been implemented in TacAir-Soar to coordinate
behavior for beyond-visual-range engagements
such as in Figure 1. These descriptions
(and our implementations) are idealizations of the
real behaviors, but capture much of the essence
of the real behaviors.

Maneuvering 

• Joining up in Formation
When the planes of a section are split, possibly
after taking off, or following an engagement,
the planes must join up into a formation. It
is the responsibility of the wingman to obtain
the correct position and this is done using visual
and radar cues without communication.
However, in cases when the two planes are far
apart, the wingman may request position
information from the lead. Although flying into
formation is primarily the responsibility of the
wingman, if the lead is far ahead, the lead may
maneuver, possibly employing a shackle turn,
to allow the wingman to catch up as in position
1 of Figure 1.

• Flying in Formation
A section of planes can fly in many different
formations, such as defensive combat spread,
offensive combat spread, fighting wing, cruise,
or trail. When flying in formation, it is the
responsibility of the wingman to maintain the
appropriate position. In Figure 1, the F-14's
are initially in a parade formation.

• Changing Formation
The specific formation used by a section can
change as the tactical situation changes. For
example, a section might start off in a tight parade
formation until it gets to its CAP station
and then assume a defensive combat spread as
in position 3 of the scenario. It maintains that
formation until the later parts of an engagement
when the planes are closing on an enemy,
at which time they move to an offensive
combat spread.

• Coordinated Maneuvering
As the lead maneuvers, the wingman attempts
to stay in formation as in position 3 of the
scenario. However, for large turns, the section
must perform special maneuvers or else
the wingman will get out of formation. For
example, when the lead wishes to turn 90 degrees
toward the wingman, as in position 2, the
lead will turn first and then the wingman will
turn once the lead crosses behind him. Conversely,
if the lead wishes to turn away from
the wingman, the wingman turns first. Other
maneuvers include in-place turns and crossover
turns.

• Tactical Maneuvering
When engaging enemy planes, a section can
use special maneuvers in order to improve the
geometry of an attack, or to confuse an enemy.
Example maneuvers include a pincer (and half
pincer), when the two planes separate and then
close on an enemy, and a post-hole, where the
section flies in a trail maneuver and the lead
plane flies in a circle (to defeat an expected
missile and possibly confuse the enemy), giving
the lead to the second plane, which
then presses the attack. In Figure 1, the red
planes attempt to employ a pincer at position
5. In addition, a section of planes may perform
defensive maneuvers together, such as jointly
turning into the beam to break radar lock and
avoid a missile. Finally, when attacking an enemy,
the wingman will usually attempt to slide
to the outside of the formation to give the lead
better position for the attack, as in position 4
of the scenario.
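The turn-ordering rule described under Coordinated Maneuvering can be sketched in a few lines. This is only an illustration written in Python, not TacAir-Soar's rule encoding; the function name `first_to_turn` and the `'left'`/`'right'` encoding are hypothetical.

```python
def first_to_turn(turn_direction: str, wingman_side: str) -> str:
    """Return which plane starts a large section turn.

    turn_direction: 'left' or 'right', the way the lead wishes to turn.
    wingman_side:   'left' or 'right', the side of the lead on which
                    the wingman is flying.
    """
    for value in (turn_direction, wingman_side):
        if value not in ("left", "right"):
            raise ValueError(f"expected 'left' or 'right', got {value!r}")
    # Turning toward the wingman: the lead turns first and the wingman
    # follows once the lead has crossed behind him.
    # Turning away from the wingman: the wingman turns first.
    return "lead" if turn_direction == wingman_side else "wingman"
```

The sketch covers only the two cases spelled out in the text; in-place and crossover turns would need additional cases.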

Sensing

• Radar
By coordinating their radars, two planes can
cover more area. The details of the "radar
contract" can be determined during the briefing
before the actual mission. When planes do
get contacts, they communicate the relevant
position information. Planes can also request
information if they have lost a contact.

• Vision
Because it is sometimes difficult for a plane to
detect enemy planes that are behind it, an important
responsibility is to check the rear of
the other plane. Another important use of vision
is to identify unknown planes. Thus, a
section may split up so that one plane can get
a visual identification while the other is positioned
for a shot if the plane is hostile. This
type of within-visual-range coordination is not
yet implemented in TacAir-Soar.


Employing Weapons

• Targeting
If there are multiple groups of enemy planes
approaching, the lead (possibly with the AIC)
must determine which group to attack first and
communicate this to the wingman.

• Sorting
When a section engages multiple enemy planes,
it is critical that the wingman and lead not
waste missiles by shooting at the same plane.
Thus, they must sort the enemy planes, possibly
by range, altitude or azimuth, so they are
targeting different planes. In general, the lead
will take the plane that is the highest threat
(usually the lead of the opposing section).
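A minimal sketch of sorting follows, assuming both planes apply the same briefed ordering so that they never pick the same target. The `Contact` fields, the `sort_targets` name, and the choice to treat the first contact in the ordering as the highest threat are illustrative assumptions, not the TacAir-Soar implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    name: str          # hypothetical internal label, e.g. "B12"
    range_nm: float
    altitude_ft: float
    azimuth_deg: float


def sort_targets(contacts, key="range_nm"):
    """Assign distinct targets to the lead and the wingman.

    Assumption: the first contact under the agreed ordering is treated
    as the highest threat and goes to the lead; the next one goes to
    the wingman.
    """
    ordered = sorted(contacts, key=lambda c: getattr(c, key))
    assignment = {}
    if ordered:
        assignment["lead"] = ordered[0].name
    if len(ordered) > 1:
        assignment["wingman"] = ordered[1].name
    return assignment
```

Because the ordering key is fixed in advance (for example in the briefing), each plane can compute the same assignment independently without extra radio traffic.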

Controlling  the  Section 

• Changing the Lead
The role of the planes within a section can
change if the wingman is in a better tactical
situation, such as having more appropriate
weapons or better situational awareness.
When the wingman takes over, he must assume
all of the responsibilities of the lead, and vice
versa for the original lead.

Communicating  Intent 

• Committing an Enemy
When an enemy plane has been identified as
a bandit, and the commit criteria are reached,
the lead will communicate the intent to intercept
to the wingman.

Flying a Section with an AIC or GCI

Normally, a section of planes will have support
from either an airborne or ground-based radar
(GCI). These radars provide a much broader picture
(approximately 250 nms for an E-2) and can
detect attacking planes well before the section itself
will see the attackers on its radars. Thus, for
blue, the AIC provides position and identity information
about other planes. The AIC can also
provide engagement information and redirect the
mission of a section, although this is not currently
implemented in TacAir-Soar. For red, the GCI is
more in control and may direct the tactics used
by the section (this is also not implemented in
TacAir-Soar).

Figure 2 is a dialogue produced between
TacAir-Soar agents acting as an AIC (Kiwi) and
a section of two F-14's (Hawk121 and Hawk122)
as they engage a MiG-29 in a slightly different
scenario than in Figure 1. Lines starting with ';'
are comments and were not part of the communication.
We have not attempted to duplicate the
communication produced by humans exactly, but
instead we have attempted to include the interactions
that are necessary for the planes to coordinate
their behavior.

Kiwi: kiwi, hawk121 your bogey is at bearing 23 for 143 angels 8
; Each plane prefaces its communication with its call sign.
; Here Kiwi is giving the bearing (23 degrees), range (143 nms) and altitude (8,000 ft).
Hawk121: Roger
Kiwi: kiwi, Contact is a bandit
; Kiwi is confirming that the bogey is an enemy plane.
Hawk121: hawk121, Contact is a bandit
Hawk122: Roger
Hawk121: hawk121, Commit bearing 23 for 140 angels 8
; Hawk121 decides its commit criteria are achieved and starts to intercept the bandit.
; Hawk121 uses the information from Kiwi to plot an intercept course.
Kiwi: kiwi, hawk121 your bogey is at bearing 21 for 137 angels 8
; Kiwi periodically reports position information to the fighters.
Hawk122: Roger
Kiwi: kiwi, Bandit is closing on a hot vector
Hawk121: hawk121, Bandit is closing on a hot vector
Hawk121: hawk121, Go to defensive combat-spread formation.
; The section changes formation for the attack.
Kiwi: kiwi, hawk121 your bogey is at bearing 12 for 116 angels 8
Hawk121: Roger
Hawk122: Roger
Hawk121: hawk121, Bandit is closing on a hot vector
Hawk121: hawk121, Fox three
; Hawk121 fires a long-range missile and then performs an f-pole maneuver.
Hawk121: hawk121, Cranking right

Figure 2: Trace of communications between an F-14 section (Hawk121 and Hawk122) and an E-2 (Kiwi).

Methods for Coordination

For a section to coordinate its behavior, the individual
agents must know many things. They
must know the appropriate techniques and methods
for maneuvering, sensing, employing weapons
and controlling the section. They must also know
the specific constraints under which the current
mission is being flown, such as rules of engagement,
commit criteria, and so on. During the
mission, they must also build up their situational
awareness, from their own sensors and through
communication with others. Finally, they must
coordinate their actions in the face of the world
around them. These different types of knowledge
are acquired at different times using the types of
methods listed below.

Common Doctrine and Tactics

Doctrine and tactics specify methods and procedures
for behaving in the world. This is similar
to social contracts, where independent agents can
create coordinated behavior by agreeing to behave
in certain ways under specific circumstances
[Shoham and Tennenholtz, 1992]. For example,
drivers in the United States coordinate their behavior
(and thus avoid accidents) by always driving
on the right side of a street. Similarly, the lead
and wingman have a division of labor so that they
are not both trying to do the same activity (such as
maintain formation) at the same time.

From the perspective of coordination, common
doctrine eliminates the need for communication
(two cars passing each other do not need to negotiate
which side they will pass), it allows an agent
to predict the behavior of other agents without
even knowing the exact identity of the agent, and
it reduces the cognitive load on an agent because
an agent does not have to plan out its behavior
from first principles.

In TacAir-Soar, common doctrine and tactics
are represented in Soar's long-term memory as
rules (as is all long-term knowledge). This constitutes
the vast majority of knowledge encoded
in TacAir-Soar.

Mission Briefing

Before a mission, the participants are briefed on
the tactical situation, their responsibilities, and
often, the responsibilities of others.

The briefing helps establish specific operational
parameters required for coordination, such as the
specific partners of a section, their formations,
the methods for communication (radio frequencies,
call signs), the default radar contract, the
default method for sorting bandits, any specific
tactics the section plans to employ, and so on.

In TacAir-Soar, the mission and the information
relevant to the current run is entered via an editor
that is an extension of ModSAF. This includes the
roles of the agents, the call sign of the agent, the
type of airplane being flown, rules of engagement,
the location of mission-relevant landmarks, and so
forth. The information is then loaded into Soar's
short-term memory, which makes it accessible to
all of the rules in TacAir-Soar.

Observed  Behavior 

During a mission, the members of a section can
directly observe each other's behavior. Thus, behavior
alone can be a signal for coordinating behavior,
as when a lead makes a small turn, without
any explicit communication.

In TacAir-Soar, there is only limited use of coordination
through observed behavior, with the
wing responding to small turns of the lead being
the best example.

Explicit  Communication 

The most flexible way to coordinate behavior is to
explicitly communicate information between two
agents. However, many factors drive the military
to minimize verbal communication (it may be difficult
to transmit because of terrain and environmental
factors, it increases the cognitive load on
the agents that initiate and receive them, and it
can be jammed, intercepted, or used to localize
the position of an agent). Explicit communication
is usually in natural language, and is one of
the most timely types of communication.

In TacAir-Soar, explicit verbal communication
is done via simulated radios (using the radio
PDUs). There are a total of approximately
twenty-five different message types that TacAir-Soar
agents can send and receive (these cover the
types of coordination covered in the previous section
including messages for coordinating standard
and tactical maneuvering, requesting and sending
information about other planes, employing
weapons, and changing the lead).

Communication with TacAir-Soar is natural
enough so that it is possible for humans to fly in
section with it using the HIP simulator interface
[van Lent and Wray, 1994]. The HIP interface
allows humans to fly either as lead or wingman
(or even as an E-2) and compose messages that
TacAir-Soar can understand, while receiving commands
or acknowledgements from TacAir-Soar.

Coordination  Capabilities 

In this section, we draw together the capabilities
required for coordination in the tactical air
domain. This is based on the types of coordinated
behavior (maneuvering, sensing, employing
weapons, etc.), and the methods to obtain
knowledge (doctrine and tactics, mission briefings,
etc.). These capabilities serve as a requirements
list for constructing an agent that can coordinate
with others in domains such as tactical
air combat. For each capability, we also describe
how TacAir-Soar implements it.

Extensive Knowledge Base

Each agent must have an extensive knowledge
base that includes all of the tactics and doctrine
applicable to its possible roles in the missions in
which it will participate. For example, a wingman
must have the same knowledge of doctrine
and tactics as the lead, so that the wingman can
take over when necessary. Much of this knowledge
is required even without coordination, but some
will be unique to coordination activities, such as
section-level tactics.

In TacAir-Soar, all of its knowledge is encoded
in a rule-base of over 1400 rules. Its doctrine and
tactics are encoded as a hierarchy of intertwined
goals that are dynamically instantiated based on
the current situation and mission.

Parameter-driven  Behavior 

An agent must be able to perform a variety of
activities in coordination with others, such as defined
by a mission briefing. The agent's behavior
must be parameterized so that the knowledge relevant
to the current mission is used. This may
sound trivial, but for some complex missions, the
information in the briefing may involve fragments
of plans that the agent must integrate into its
overall behavior at the appropriate times. Thus,
the generators of the agent's behavior must be
flexible enough so that they can be modified during
a briefing.

Although one might be tempted to selectively
build the knowledge base of an agent during the
mission briefing, this would greatly restrict the
abilities of that agent during the execution of a
mission because of the dynamic nature of missions.
For example, once the planes have taken
off and are heading to their original CAP station,
the situation may change so that they are redirected
to a different CAP station.

In TacAir-Soar, all mission-related behavior is
based on a representation of the current mission
that is held in working memory. This can be examined
by the rules that make up its long-term
knowledge. The mission can be specified at briefing
time, but also can be dynamically changed
during the mission.
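The idea of parameter-driven behavior can be illustrated with a small Python sketch: the rule never hard-codes a destination, it reads it from the mission description held in memory, so a mid-mission redirect changes behavior without changing the rule. The `WorkingMemory` class, the `cap_station` key, and `cap_station_rule` are hypothetical names, not Soar or TacAir-Soar constructs.

```python
class WorkingMemory:
    """Holds the current mission description (a stand-in for Soar's
    working memory, greatly simplified)."""

    def __init__(self, briefing):
        # Copy the briefing so later updates do not alter the original.
        self.mission = dict(briefing)

    def update_mission(self, **changes):
        # The mission may be changed dynamically during execution.
        self.mission.update(changes)


def cap_station_rule(wm):
    """Stand-in for a long-term rule: it examines the current mission
    instead of embedding mission-specific values."""
    return f"fly-to {wm.mission['cap_station']}"
```

For example, after `wm = WorkingMemory({"cap_station": "alpha"})`, a call to `wm.update_mission(cap_station="bravo")` redirects the same rule to the new station.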

Reactive  Execution 

In order to respond quickly to changes in a partner's
behavior, an agent must be reactive. Of
course, computer generated forces must in general
be reactive, but coordination requires that
they sometimes closely monitor the activities of
other friendly agents. When flying in a section,
the wingman must monitor the actions
of the lead, as well as the current spacing between
the planes.

In TacAir-Soar, the wingman's main goal is to
fly in formation with the lead. Rules are constantly
monitoring the lead's actions and the position
of the wingman relative to the lead. Whenever
the wingman is out of position, rules fire to
modify the heading, speed, or altitude as necessary.
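A rough Python sketch of such a reactive check is shown below. It assumes the wingman trails the lead, and the `PlaneState` fields, the 20-knot speed increment, and the spacing tolerance are illustrative values rather than anything taken from TacAir-Soar.

```python
from dataclasses import dataclass


@dataclass
class PlaneState:
    heading_deg: float
    speed_kts: float
    altitude_ft: float


def wingman_corrections(lead, wing, spacing_nm, desired_spacing_nm, tol_nm=0.2):
    """Return the adjustments the wingman should make on this cycle.

    An empty dict means the wingman is already in position.
    Assumption: the wingman is behind the lead, so too much spacing is
    fixed by speeding up and too little by slowing down.
    """
    corrections = {}
    if wing.heading_deg != lead.heading_deg:
        corrections["heading_deg"] = lead.heading_deg
    if wing.altitude_ft != lead.altitude_ft:
        corrections["altitude_ft"] = lead.altitude_ft
    if spacing_nm > desired_spacing_nm + tol_nm:
        corrections["speed_kts"] = lead.speed_kts + 20  # close the gap
    elif spacing_nm < desired_spacing_nm - tol_nm:
        corrections["speed_kts"] = lead.speed_kts - 20  # open the gap
    elif wing.speed_kts != lead.speed_kts:
        corrections["speed_kts"] = lead.speed_kts
    return corrections
```

Running this check on every decision cycle is what makes the behavior reactive: no plan is stored, and each correction depends only on the currently observed state.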

Interruptible  Processing 

In being reactive, an agent is changing its behavior
in response to the environment; however, it is
not performing any extensive reasoning, nor is it
necessarily interrupting its ongoing goals. However,
when an agent is communicating with other
agents, it must often interrupt its current goals
both to process the communication and to change
its behavior in response to a message. For example,
an agent may be flying an intercept of a
bandit based on previous information from an E-2.
When a new message arrives with new position
information on the bandit, the agent must
acknowledge the message, possibly abandon its
current heading and compute a new heading.

In Soar, we have split the processing of incoming
communications into two steps. The first is a
high priority activity that categorizes the message
and modifies the internal state of the agent in response
to the message. The purpose is for this to
happen quickly before other messages overwrite
it. Following this, rules sensitive to the change
will suggest changes to the current activities that
the agent is pursuing. A more extensive examination
of the problem of integrating communication
(and natural language processing) within
Soar systems has been done within the context
of modeling the NASA Test Director, who is responsible
for coordinating the launch of the Space
Shuttle [Nelson et al., 1994].
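The two-step split can be sketched as follows. This is a simplified Python illustration under assumed names (`categorize`, `propose_changes`, and the message kinds are hypothetical); the real system expresses both steps as Soar rules.

```python
def categorize(message, state):
    """Step 1 (high priority): classify the incoming message and record
    it in the agent's internal state before another message arrives."""
    text = message.lower()
    if "bearing" in text:
        kind = "position-report"
    elif "bandit" in text:
        kind = "identification"
    else:
        kind = "other"
    state["last_message"] = {"kind": kind, "text": message}
    return kind


def propose_changes(state):
    """Step 2: rules sensitive to the state change suggest changes to
    the activities the agent is currently pursuing."""
    msg = state.get("last_message")
    if msg is None:
        return []
    if msg["kind"] == "position-report":
        return ["acknowledge", "recompute-intercept-heading"]
    if msg["kind"] == "identification":
        return ["acknowledge"]
    return []
```

Keeping the first step cheap is the point of the design: the message is captured in state immediately, and the slower decision about what to do with it is deferred.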

Translate internal information into
messages

In order to communicate with other agents, an
agent must be able to translate its internal information
about its goals, its perception of the
world, and its current actions into a form that
can be understood by other agents. To do this
right in general requires solving the natural language
generation problem.

In the current version of TacAir-Soar, we are
finessing the general problem and using an ad
hoc approach where we prespecify the messages
that the system can generate and when it should
generate them. Thus, our agents do not explicitly
plan their communications nor do they dynamically
construct messages from the appropriate
pieces. Instead they fill in prespecified templates.
This approach has been successful for the
limited types of communication our agents need
to produce, but will break down when we get to
more complex interactions and for these we are
investigating more general approaches [Rubinoff
and Lehman, 1994]. The form of our messages is
based on the "Comm Brevity" lists of terms used
by Navy pilots. This list contains over 150 terms,
of which we use only those required for our current
level of coordination, which is approximately
30 terms.

The most problematic type of communication
is when an agent wishes to refer to another plane.
For example, when the lead wishes to tell the
wingman that they are committing to a bandit,
the lead needs to specify which bandit it is. Internal
to the lead, this may be represented by an
internally generated name (such as B12), but the
lead cannot use that in the communication. Instead,
the lead must use positional information,
such as the bearing, range, and altitude of the
bogey. This is problematic because the positional
information is inexact and time dependent.
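Template filling of this kind might look like the Python sketch below. The template strings are modeled on the Figure 2 dialogue, but the `TEMPLATES` table, `describe`, and `generate` are hypothetical names and not the system's actual message definitions.

```python
# Prespecified message templates, modeled on the Figure 2 dialogue.
TEMPLATES = {
    "position-report": ("{sender}, {receiver} your bogey is at "
                        "bearing {bearing} for {range_nm} angels {angels}"),
    "commit": "{sender}, Commit bearing {bearing} for {range_nm} angels {angels}",
    "identify": "{sender}, Contact is a bandit",
}


def describe(bearing_deg, range_nm, altitude_ft):
    """Replace an internal name such as B12 with positional slots.
    Altitude is reported in 'angels' (thousands of feet)."""
    return {
        "bearing": round(bearing_deg),
        "range_nm": round(range_nm),
        "angels": round(altitude_ft / 1000),
    }


def generate(kind, **slots):
    """Fill the prespecified template for this message type."""
    return TEMPLATES[kind].format(**slots)
```

Note that the rounding in `describe` is one concrete source of the inexactness mentioned above: the receiver only ever sees the rounded values.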

Translate  messages  into  internal 
information 

The converse of the prior problem is translating
messages from other agents into an internal representation
that the agent can work with. As above,
to do this right in general requires solving the natural
language understanding problem.

In the current version of TacAir-Soar, we are
also finessing this problem by only accepting the
message types that our agents generate (although
we are also examining more general approaches
[Rubinoff and Lehman, 1994]). By limiting the
types of messages the system can accept, it is
straightforward to translate the messages into the
internal goals, actions, and state information of
our agents. As above, the most problematic task
is handling references to other agents, and this
is done by finding the agent in the environment
that most closely matches the description it is
sent [Jones and Laird, 1994].
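A closest-match lookup can be sketched in Python as below. The actual matching method is the one described in [Jones and Laird, 1994]; this sketch substitutes a simple unweighted distance and assumes that the description and the receiver's own contacts are expressed in the same frame of reference. All function names and the contact table layout are hypothetical.

```python
import re

PATTERN = re.compile(r"bearing\s+(\d+)\s+for\s+(\d+)\s+angels\s+(\d+)",
                     re.IGNORECASE)


def parse_position(message):
    """Extract bearing, range and altitude from an accepted message type."""
    m = PATTERN.search(message)
    if m is None:
        return None
    bearing, range_nm, angels = (int(g) for g in m.groups())
    return {"bearing_deg": bearing, "range_nm": range_nm,
            "altitude_ft": angels * 1000}


def mismatch(description, contact):
    """Unweighted mismatch score (an assumption, not the paper's metric)."""
    d_bearing = abs(description["bearing_deg"] - contact["bearing_deg"]) % 360
    d_bearing = min(d_bearing, 360 - d_bearing)  # wrap around north
    d_range = abs(description["range_nm"] - contact["range_nm"])
    d_alt = abs(description["altitude_ft"] - contact["altitude_ft"]) / 1000
    return d_bearing + d_range + d_alt


def resolve(description, contacts):
    """Return the internal name of the contact that best fits."""
    if not contacts:
        return None
    return min(contacts, key=lambda name: mismatch(description, contacts[name]))
```

Because reported positions are inexact and age quickly, a nearest-match rule like this tolerates small differences between what was sent and what the receiver currently observes.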

Conclusion 

The purpose of this paper is to examine the capabilities
required in a computer-generated force to
support coordination. We have studied the low
end of coordination as implemented in TacAir-Soar,
where there are tight interactions between
the agents involved. TacAir-Soar is proof that
such coordination is possible, but that it required
knowledge-rich, reactive, interruptible processing,
with high frequency of relatively short messages.
Our long term goal is to study coordination
across the command hierarchy. As we move
up the command hierarchy, we would expect that
the frequency of the messages will decrease and
the length of the message will increase, placing
less emphasis on reactivity and interruptibility
and more emphasis on the process of interpreting
and generating messages.

Abstract

The creation of autonomous intelligent forces
(IFORs) for both large-scale distributed simulations
and small-scale, focussed training exercises
creates unique challenges for natural
language processing. An IFOR's role will
often be to replace one or more individuals
in an engagement, making the ability to
communicate in natural language key to its
performance as well as its acceptance by other
participants. In this paper, we describe the
capabilities an IFOR needs to communicate
appropriately and discuss how the NL-Soar
language system provides these capabilities
for TacAir-Soar, an IFOR agent for beyond-visual-range
combat.

Introduction: IFORs and Communication

The creation of autonomous intelligent forces
(IFORs) offers the possibility of running both
large-scale distributed simulations and small-scale,
focussed training exercises with lower
manpower, cost and logistical support requirements
than previously possible. However,
since an IFOR's role will often be to
replace one or more individuals in an engagement,
the ability to communicate in natural
language can be a key aspect of its overall
performance. An agent that cannot communicate
at all is severely limited in the roles it
can play. An agent that uses only a highly
restricted subset of natural language may be
easily detectable as a computer-generated foe,
one that can be "gamed" without providing the
training experience that is the point of
the exercise. Further, an agent that is unlikely
to comprehend the subset of language actually
used by human participants puts an undue burden
on those participants to communicate in
a way that it can respond to, again changing
the rules of the game. Finally, an agent that
is rigid in its communicative ability may introduce
a brittleness into the simulation (i.e. a
tendency to fail in unexpected ways) that has
nothing to do with imperfections in strategic
or tactical knowledge.

Although the need to address the problem
of natural language processing for IFORs is
clear, the problem is complicated by the diverse
ways in which NL can be called on
to augment the functionality of the agent.
For example, in building TacAir-Soar, a jet-fighter
pilot IFOR for beyond-visual-range
combat [R'ILR93, Rjr94], an NL capability
is needed for basic interaction among pilot,
wing, and air intercept control (AIC), as
well as for descriptive explanation both during
flight and in after-action review. We
are adapting the NL-Soar language system
[LLN91, Lew93] to provide that capability.*

*Here, we discuss our current work in basic interactive
communication; see [Joh94] for details on
explanation in after-action review.

There are three main characteristics of communication
during air combat that present
challenges for this research. The first challenge
stems from the nature of the task itself:
language processing occurs in real-time, as
a single aspect of behavior in a constantly
changing situation. Thus, in order to adequately
simulate a human pilot, an IFOR must
comprehend and generate language at roughly
human rates. If it is too slow, it will be unable
to keep up with both the linguistic and non-linguistic
demands of the environment. If it
is too fast, it may commit to actions before coordinating
its behavior with other sources of
information (e.g. visual information from the
radar).

The  second  challenge  stems  from  the  nature 
of  the  implementation:  NL-Soar  must  be  inte¬ 
grated  into  the  structure  of  an  independently- 
designed  system.  The  organization  of  TacAir- 
Soar  (and  of  IFORs  in  general)  derives  from 
the  nature  of  the  task(s)  it  performs;  there  is 
no  a  priori  reason  to  expect  this  organization 
to  be  consistent  with  the  assumptions  under¬ 
lying  NL-Soar’s  design. 

The third challenge stems from the particular
nature of language in the domain. NL-Soar
was originally designed to process complete
grammatical sentences. The language
in the tactical air combat domain differs from
this both by including "ungrammatical" utterances
such as sentence fragments and by containing
many special purpose constructions
(e.g. "roger" or the use of call-signs). In the
rest of this paper, we explore the implications
of these challenges in more detail and discuss
how we are addressing them.

Real-time  Communication 

Communication in an IFOR must occur in
real-time. This is not a statement about how
fast the system must run, per se. Rather, it
is a theoretical statement about how processing
must occur within the system. Put simply,
people can comprehend at rates of about
250 msec/word (they tend to generate language
a bit more slowly). Although there is
variability (some words take as little as 50
msec, others may take closer to 1000 msec),
the point is that, in general, the amount of time
is linear in the number of words in the utterance.
A number of design constraints follow
from this simple regularity [LLN91], e.g. construction
of the meaning of the sentence must
proceed incrementally, different knowledge
sources (e.g. syntax, semantics, pragmatics)
must be applied in an integrated rather than
pipe-lined or multi-pass fashion. NL-Soar
provides these properties [LLN91, Lew93].
Briefly, the system relies on Soar's notion
of impasse to control the search through its
linguistic knowledge sources, and then on
Soar's learning mechanism to compile those
disparate pieces of knowledge into an integral
form that can be applied directly (i.e.
in constant time/word) in the future.

To make the nature of integration in NL-Soar more concrete,
consider Figure 1, a graphical representation of a particular
system that uses NL-Soar for comprehension and generation.
Linguistic processes, like all processes in Soar, are cast as
sequences of operators (small arrows) that transform states (boxes)
until a goal state is achieved. The triangles in the picture
represent problem spaces, which are collections of operators.* The
comprehension problem spaces contain operators that use input from
the perceptual system to build syntactic and semantic structures on
the state; the generation problem spaces contain operators that use
semantic structures to produce syntactic structures and motor
output. Note that there is a special problem space, labelled Top,
which is connected to the perceptual and motor systems. The Top
space is the only problem space designated by the Soar
architecture; all other problem spaces are provided by the system
designer. The dotted lines in the figure represent Soar's impasses,
which arise automatically when there is a lack of knowledge
available in the current problem space. When an impasse arises,
processing continues in subspaces until the goal state in the
subspace is reached. Thick banded arrows represent the resolution
of an impasse, when chunks are formed. Chunks are new pieces of
knowledge that are added to the system. They combine those
conditions in the pre-impasse problem space that were used to reach
the goal state in the subspace with the actions performed in the
subspace to reach the goal state.

*For more details on how Soar uses problem spaces, states and
operators to organize its processing see [New90, LNR87].

What does this mean for NL-Soar? As an example, consider the
arrival of a new word into the Top state in some established
context. Now assume that we have never seen the word in a similar
context in the past. An impasse will arise and problem solving will
continue in the Comprehension spaces until we reach the goal state
in which we have defined the appropriate syntactic and semantic
structures. When we return those structures to the Top state,
chunks will be formed. In this case the chunks will propose
operators directly in the Top state the next time this word is seen
in a similar context. In other words, the next time, no impasse
will occur; the problem solving that took place in the subspaces
has been integrated into a small number of Top space operators that
execute directly to build the relevant structures on the Top state.
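The effect of chunking on a repeated word can be pictured, very loosely, as caching the result of a slow search. The Python sketch below is only an analogy under stated assumptions: the class `Comprehender`, the function `toy_subspace_search`, and the exact-match key on (word, context) are all hypothetical, and real Soar chunks are rules whose conditions generalize beyond an exact repeat.

```python
class Comprehender:
    """Toy analogy: an 'impasse' triggers a slow search, a 'chunk' caches its result."""

    def __init__(self, subspace_search):
        self._search = subspace_search   # stands in for problem solving in subspaces
        self.chunks = {}                 # (word, context) -> structures built
        self.impasses = 0                # how often the slow path was needed

    def comprehend(self, word, context):
        key = (word, context)
        if key in self.chunks:
            # Knowledge is directly available: no impasse, one-step application.
            return self.chunks[key]
        # No directly applicable knowledge: search, then store the result.
        self.impasses += 1
        structures = self._search(word, context)
        self.chunks[key] = structures
        return structures


def toy_subspace_search(word, context):
    # Hypothetical stand-in for deliberate syntactic/semantic processing.
    role = "acknowledgement" if word == "roger" else "unknown"
    return (role, f"{word}@{context}")


agent = Comprehender(toy_subspace_search)
first = agent.comprehend("roger", "after-report")
second = agent.comprehend("roger", "after-report")   # served from the cache
```

In this analogy the second call corresponds to the case in which no impasse occurs, because the earlier problem solving is already available at the top level.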

A consequence of relying on Soar's learning mechanism is that
achieving real-time language behavior requires training NL-Soar
off-line in advance. (Requiring NL-Soar to "learn while doing"
would be equivalent to expecting the pilot to learn the domain
language while flying the plane in battle.) When first loaded,
TacAir-Soar/NL-Soar's lexical, syntactic, semantic, and discourse
knowledge are all separate; it's as if the IFOR knew all the rules
for how to communicate but had no experience using them. Off-line
training allows NL-Soar to learn from experience in a non-real-time
setting. This gives the system the time it needs to integrate its
disparate knowledge sources into "chunks" that NL-Soar can apply in
a single step. It is this highly compiled form of language
knowledge that models an experienced pilot and provides real-time
language behavior on-line.

Integrating  Language  with  the  Task 

As mentioned above, NL-Soar was developed independently of
TacAir-Soar. Indeed, as with many NL systems, NL-Soar was developed
independently of the need to actually do anything non-linguistic.
But, of course, most language, and certainly the communication
between a pilot and AIC or wing, is generated and comprehended in
service of some task that is, itself, essentially non-linguistic.
As a result, NL-Soar must be adapted to seamlessly integrate the
language capability with those non-linguistic capabilities in the
agent, e.g. perception, planning, reasoning about the task. We have
successfully done this on a smaller scale in NTD-Soar, a non-IFOR
agent.* The structure of NTD-Soar, shown in Figure 1, is quite
different from that of TacAir-Soar. In particular, NTD-Soar models
switching between multiple tasks by invoking each task from the Top
problem space; if a task is interrupted, its state is preserved in
the Top space until it can be resumed. In essence, NTD-Soar models
tasks in a fashion similar to co-routines. This structure allows
language to be integrated easily by treating it as just another
task. Information is transferred

*NTD-Soar is a model of the NASA test director who is responsible
for coordinating many facets of the testing and preparation that
the Space Shuttle must go through before it can be launched
[NLJ94].
Figure 1: Structure of NTD-Soar


between  language  and  other  tasks  by  sharing 
the  common  Top  state  in  the  same  problem 
space  in  which  the  task-switching  control  is 
done. 

TacAir-Soar, in contrast, keeps only a single task active at a
time, but it maintains a stack of levels of abstraction of that
task, and each level stays active as long as it is being carried
out. Thus TacAir-Soar uses Soar's top state to keep track of the
"execute-mission" task, which stays active for the entire
simulation. Under this will be a stack of sub-tasks, such as
"mig-sweep", "intercept", "employ-weapons", and so on, each
representing a more detailed view of what the agent is currently
trying to do. Much of TacAir-Soar's knowledge of its current
situation and goals is stored in sub-states associated with these
subspaces, not on the top state. Thus if TacAir-Soar switched to
language in its top state, as NTD-Soar does, it would lose much of
this knowledge.

Because of the need to preserve TacAir-Soar's stack of subtasks,
we have modified NL-Soar to operate at any level of the stack
rather than just at the top. NL-Soar is thus invoked as a sub-task
of the bottom-level task, preserving the stack. The resulting
structure can be seen in Figure 2. This structure has the
consequence of making language operate as a sub-task of the domain
task(s), rather than as a separate task alongside them. While this
is often reasonable, since the agent may be talking about what it's
currently doing, it is not always so. Particularly in the case of
comprehension, it may turn out that what someone says to the agent
has to do with an entirely new task that the agent will start
working on because of the communication. Given this and related
problems, we are still exploring the overall issue of how best to
integrate the structures of TacAir-Soar and NL-Soar.
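The difference between invoking language at the top of the stack and invoking it beneath the bottom-level task can be illustrated with a small Python sketch. It is an illustration only: `GoalStack`, its methods, and the sub-state contents (`sector`, `target`) are hypothetical names, and a plain list stands in for Soar's goal stack.

```python
class GoalStack:
    """Toy stack of task levels, each with its own local sub-state."""

    def __init__(self, top_task):
        self.levels = [(top_task, {})]

    def push(self, task, **local_state):
        self.levels.append((task, dict(local_state)))

    def pop(self):
        if len(self.levels) == 1:
            raise RuntimeError("the top-level task stays active")
        return self.levels.pop()

    def tasks(self):
        return [task for task, _ in self.levels]

    def switch_at_top(self, task):
        """Switch tasks directly under the top level; lower levels are discarded."""
        discarded = self.levels[1:]
        self.levels = self.levels[:1]
        self.push(task)
        return discarded


# Language invoked beneath the bottom-level task: the stack survives.
stack = GoalStack("execute-mission")
stack.push("mig-sweep", sector="north")       # hypothetical sub-state
stack.push("intercept", target="bogey-1")     # hypothetical sub-state
stack.push("comprehend-utterance")
during_language = stack.tasks()
stack.pop()
after_language = stack.tasks()

# Language invoked at the top: the sub-states below the top are lost.
other = GoalStack("execute-mission")
other.push("mig-sweep", sector="north")
other.push("intercept", target="bogey-1")
discarded = other.switch_at_top("comprehend-utterance")
```

The second half of the sketch shows why a top-level switch is costly when most situational knowledge lives in the sub-states.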


Using  Realistic  Language 

In addition to being developed independently of TacAir-Soar,
NL-Soar was also developed independently of the language of the
tactical air domain. This has two specific consequences. First,
NL-Soar does not contain any of the domain-specific words and
constructions used in tactical air combat. Furthermore, NL-Soar was
designed to contain only competence knowledge. The
competence-performance distinction [Cho65] reflects the difference
between what people would recognize as fluent grammatical speech,
and actual speech as it occurs in everyday conversation. Thus
NL-Soar must be able to comprehend and generate in accordance with
domain-specific performance data, with all of its idiosyncratic
constructions, ungrammaticalities, self-corrections, etc. In order
to help adapt NL-Soar to this requirement we have collected
protocols of pilot/AIC and pilot/wing/AIC communication in a number
of scenarios in a simulated environment. Doctrine with respect to
communication is quite specific, stressing brevity and clarity. In
addition to a highly specialized lexicon, this tends to result in a
fairly agrammatical, telegraphic style, with periodic lapses into
more standard English. This can be seen, for example, in Figure 3,
which shows an excerpt from our protocols in which the AIC (whose
call-sign is blue tail) guides the pilot (whose call-sign is dakota
204) to acquire his bogey (unidentified radar contact).
(Punctuation has been added to aid the reader.) We can see here the
use of domain-specific forms at all levels: syntactic (using
call-signs in every sentence), semantic ("single" meaning "a plane
flying unaccompanied"), and discourse ("roger" to acknowledge
having heard someone). In addition, the pilot's last utterance
demonstrates the kinds of "imperfect" speech (here pauses marked by
"uh" and "eh") that NL-Soar must be able to comprehend and
generate.

This challenge is easier to handle in generation than in
comprehension, because NL-Soar has control over the structures that
produce the surface form during generation; if it needs to generate
an "ungrammatical" structure, it can simply build it and mark it as
a special case. Of course, some care must still be taken to make
sure that the special cases aren't too general (for example, the
ability to say "roger" must not allow NL-Soar to utter any word as
a single-word utterance). The problem is more complex during
comprehension because the system is trying to recover the relevant
structures from the surface form. Since NL-Soar can't know in
advance whether the current utterance is following normal English
grammar, a domain-specific grammar, or represents some sort of
speech error, the search space of possible interpretations can
become quite large. There are a number of techniques that have been
developed for dealing with this problem [FB86, Gra83, WB80, Leh90],
although it remains for us to systematically evaluate the
usefulness of each technique given the structure of NL-Soar and the
particular linguistic phenomena in the tactical air domain.

SPEAKER  START     END       UTTERANCE
AIC      13:30:41  13:30:46  dakota 2 0 4. blue tail, contact 2 7 0.
                             approximately 50 miles.
Pilot    13:30:48  13:30:49  dakota 2 0 4 is clean.
AIC      13:30:52  13:31:00  roger. dakota 2 0 4 contact now 2 7 0.
                             approximately 45 miles. appears to be
                             single, contact at angels 18.
Pilot    13:31:01  13:31:04  dakota 204 roger. intermittent contact
AIC      13:31:06  13:31:11  dakota 204. contact's now 2 7 0.
                             approximately 35 miles.
Pilot    13:31:15  13:31:22  dakota 20 4. contact on the nose, uh
                             bearing 2 5 5. eh 26 miles.
AIC      13:31:23  13:31:24  roger dakota that's your contact

Figure 3: Sample pilot conversation
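To give a feel for the surface phenomena involved, the Python sketch below normalizes an utterance adapted from the pilot's last turn in Figure 3: it drops the pause markers "uh" and "eh", joins digits that are spoken one at a time, and then picks out a bearing and a range. This is a flat pattern-matcher written purely for illustration; it is not how NL-Soar comprehends language, and the function names and the output keys `bearing_deg` and `range_miles` are assumptions.

```python
import re

FILLERS = {"uh", "eh"}   # pause markers seen in the sample protocol


def normalize(utterance):
    """Lower-case, drop fillers and punctuation, join runs of single digits."""
    out = []
    in_digit_run = False
    for tok in re.findall(r"[a-z']+|\d+|[.,]", utterance.lower()):
        if tok in FILLERS or tok in (".", ","):
            in_digit_run = False          # fillers and punctuation end a digit run
            continue
        if tok.isdigit() and len(tok) == 1:
            if in_digit_run:
                out[-1] += tok            # "2", "5", "5" -> "255"
            else:
                out.append(tok)
            in_digit_run = True
        else:
            out.append(tok)
            in_digit_run = False
    return out


def extract_report(tokens):
    """Pick out 'bearing <n>' and '<n> miles' if present."""
    info = {}
    if "bearing" in tokens:
        i = tokens.index("bearing")
        if i + 1 < len(tokens) and tokens[i + 1].isdigit():
            info["bearing_deg"] = int(tokens[i + 1])
    if "miles" in tokens:
        i = tokens.index("miles")
        if i > 0 and tokens[i - 1].isdigit():
            info["range_miles"] = int(tokens[i - 1])
    return info


utterance = "dakota 2 0 4. contact on the nose, uh bearing 2 5 5. eh 26 miles."
tokens = normalize(utterance)
report = extract_report(tokens)
```

A matcher of this kind handles only the patterns it was written for, which is one way to see why the space of possible interpretations grows once fragments, domain-specific forms, and speech errors must all be considered together.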

Summary 

In this paper, we have discussed some of the challenges involved
in providing an IFOR with communication capabilities that allow it
to successfully simulate human behavior. These include the need to
communicate in a real-time environment, to appropriately integrate
the IFOR's language processing with its task operations, and the
need to cope with domain-specific and ungrammatical language. Our
adaptation of the NL-Soar language system to work with the
TacAir-Soar IFOR agent continues to be guided by these concerns;
despite a number of unresolved issues, we believe NL-Soar has the
potential to provide TacAir-Soar with the necessary kinds of
linguistic behavior.
Abstract

In order to explore the domain of air-to-air combat with Soar, a
unified theory of cognition used to model human behavior, it was
necessary to interface Soar to vehicles which use the Distributed
Interactive Simulation (DIS) protocol. Rather than create what
would be in essence a simulator of fighter aircraft, the ModSAF
simulation system was chosen to simulate fighter aircraft and
provide a DIS interface. To link Soar and ModSAF, we have developed
the Soar/ModSAF Interface (SMI). The SMI provides a simulated
cockpit for Soar pilots. To guide others in the development of
interfaces for other intelligent systems, this paper describes the
SMI along with associated design constraints. Implementation
details concerning functionality, modularity, and efficiency are
addressed. We also identify issues arising from integration
difficulties.

Introduction 

Computer modeling of intelligent agent behavior is a concern to
many researchers in the fields of cognitive science, artificial
intelligence, and psychology. The Soar community is particularly
interested in developing a model which encompasses a unified theory
of cognition [Soar]. To this end, Soar researchers are interested
in modeling agents that operate in challenging environments
[TacAir]. Dynamic environments which require the application of a
fair amount of domain knowledge offer a diverse set of problems
that must be addressed in developing agents which simulate
intelligent behavior. These problems require the development of a
number of cognitive facilities in order to successfully simulate
agent behavior and the unified approach of Soar is helpful in
merging these facilities into a coherent whole.

One environment which provides these challenges is the domain of
air-to-air combat. In this domain, fighter pilots must make quick
decisions concerning enemy aircraft in the service of completing a
mission. Developing an intelligent vehicle or robot to operate in
such an environment is much too costly and the requirements of
sensorimotor hardware development are too distracting from the
central focus of behavior modeling. Given these concerns, the
natural testbed for such development is a simulator. This simulator
should provide a rich, high-fidelity world so that modeling of
pilot behavior is not perverted by simulation artifacts.
Fortunately, the ModSAF system [ModSAF] provides a rich simulation
environment - it is designed to simulate vehicles in cooperation
with conventional live force exercises.

Figure 1: The Relationship of Soar, ModSAF, and the SMI.

ModSAF provides a platform for research into the control of all
kinds of computer generated forces. In essence, ModSAF simulates
the operation of DIS compatible vehicles. These vehicles can be
directed by software-controlled agents or human beings. By using
ModSAF, researchers can focus their work on the development of
believable agents rather than on vehicle simulation issues, such as
motion dynamics and DIS networking. Our work deals with the
problems of interfacing artificially intelligent agents, modeled
using the Soar system, to ModSAF. The module which supports the
connection between the two systems is called the Soar/ModSAF
Interface (SMI). Figure 1 shows the relationship between Soar,
ModSAF, and the SMI.
Soar agent condor101> p o62
(o62 ^racetrack-dir o92
     ^racetrack-length o93
     ^type barcap
     ^risk-type high
     ^heading o46
     ^altitude o47
     ^speed o48
     ^id *none*
     ^e2c-id *none*
     ^level-of-experience low
     ^voice *none*
     ^ground-voice *none*)
(o92 ^value 0 ^units degrees)
(o93 ^value 36000 ^units meters)
(o46 ^value 0 ^units degrees)
(o47 ^value 7900 ^units feet)
(o48 ^value 320 ^units meters/second)
Soar agent condor101>

Figure  2:  Working  Memory  Elements  (WMEs) 
representing  vehicle  status  information. 


First we discuss the abstraction the SMI creates for Soar. We then
move to the connection between Soar and ModSAF, and why other
alternatives were not used. The "division of labor" among Soar,
ModSAF, and the SMI is also explained, followed by implementation
details.

The  Cockpit  Abstraction 

Since Soar agents are constructed by modeling human pilots, it is
imperative that the SMI provide an interface which emulates the
environment of the human pilots - the aircraft cockpit. Soar agents
receive input data corresponding to sensory information they would
obtain from the cockpit environment, e.g. radar displays, radio
messages, vehicle status indications, and visual sightings out of
the cockpit canopy. This information is provided to Soar in the
form of symbolic working memory elements (WMEs), not images or
digitized audio. WMEs are the basic unit of information on which
Soar acts. Soar agents also issue output commands to control the
vehicle's motion, radar, weapons and radio. The specific Soar I/O
WMEs defining the Application Programmer Interface (API) to the
simulated cockpit are documented elsewhere.* An example of the WMEs
representing the vehicle's status is shown in Figure 2.
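As a rough illustration of how such symbolic input can be read, the Python sketch below stores a few of the Figure 2 elements as (identifier, attribute, value) triples and follows an attribute to its value and units. The triple representation and the helper names `lookup` and `quantity` are assumptions made for this example; they are not the SMI's or Soar's actual data structures, and the identifiers are written here as o62, o47, and so on.

```python
# A few elements from Figure 2, written as (identifier, attribute, value).
wmes = [
    ("o62", "type", "barcap"),
    ("o62", "risk-type", "high"),
    ("o62", "altitude", "o47"),
    ("o62", "speed", "o48"),
    ("o47", "value", 7900),
    ("o47", "units", "feet"),
    ("o48", "value", 320),
    ("o48", "units", "meters/second"),
]


def lookup(wmes, identifier, attribute):
    """Return the value of the first matching element, or None if absent."""
    for ident, attr, value in wmes:
        if ident == identifier and attr == attribute:
            return value
    return None


def quantity(wmes, identifier, attribute):
    """Follow an attribute to its sub-object and return (value, units)."""
    sub_object = lookup(wmes, identifier, attribute)
    return lookup(wmes, sub_object, "value"), lookup(wmes, sub_object, "units")


altitude = quantity(wmes, "o62", "altitude")
speed = quantity(wmes, "o62", "speed")
```

Keeping value and units together in a sub-object, as Figure 2 does, lets a reader of the data check units explicitly instead of assuming them.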

Unfortunately,  there  is  no  cockpit  component 
provided  by  ModSAF.  The  SMI  must  create  this 

*Users with access to the World Wide Web on the Internet can view
this information using the URL http://krusty.eecs.umich.edu/ifor.


facility for Soar agents via manipulation of the relevant ModSAF
components and creation of new facilities. For this reason, the SMI
is not a simple translation device between the two systems.
Currently, control of a vehicle occurs through setting the desired
state of the vehicle such as its speed, heading, and altitude. This
level of control makes certain tasks difficult. For instance, it is
not possible to cause the vehicle to climb without providing a
desired altitude. Thus, attempting to stay in formation with
another vehicle during a climb is very difficult since the agent
must constantly monitor the altitude of the other vehicle and reset
its desired altitude accordingly, rather than simply climbing until
the other vehicle levels off. To more accurately model the vehicle
control available to a pilot, the SMI will need to access
lower-level ModSAF libraries which more closely correspond to
cockpit controls.
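The formation-climb difficulty can be made concrete with a small Python sketch. Under desired-state control the follower has no "climb" command, so it must issue a new desired altitude every time the leader's altitude changes. The functions `step_toward` and `follow_climb`, the tick-based update, and the numbers used are illustrative assumptions, not ModSAF behavior.

```python
def step_toward(current, desired, max_delta):
    """Move current toward desired by at most max_delta (one simulation tick)."""
    delta = max(-max_delta, min(max_delta, desired - current))
    return current + delta


def follow_climb(leader_altitudes, start_altitude, climb_per_tick):
    """Track a climbing leader using only 'set desired altitude' commands."""
    altitude = start_altitude
    desired = None
    commands_issued = 0
    trace = []
    for leader_altitude in leader_altitudes:
        if leader_altitude != desired:
            # The only available control: reset the desired altitude.
            desired = leader_altitude
            commands_issued += 1
        altitude = step_toward(altitude, desired, climb_per_tick)
        trace.append(altitude)
    return trace, commands_issued


# Leader climbs for three ticks and then levels off.
trace, commands_issued = follow_climb([1000, 2000, 3000, 3000, 3000], 0, 1000)
```

In this sketch one command is needed for every tick of the leader's climb, whereas a cockpit-level "climb until the leader levels off" control would need far fewer.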

The majority of the cockpit functionality is already provided by
ModSAF in other libraries. Examples include the radar screen,
missile launching, and detection of visual objects. These
components are relatively straightforward to access, requiring only
the translation of units, reformatting, and reorganization of data.
However, the missing cockpit components require the development of
completely new functionality. A radar warning receiver and a radio
device for inter-agent communication are examples. These additional
components use ModSAF libraries at a low level, if at all. Much
more design and development is required for such enhancements.

Communication  between  Soar  and 
ModSAF 

The SMI design must be efficient and modular. Soar and ModSAF are
designed as stand-alone systems and each system is designed to be
the primary process running - not needing to function with other
large processes. While these could be run as separate processes,
Soar, ModSAF, and the SMI are incorporated into a single process to
reduce communication overhead and increase overall system
throughput. There is no need to encode and decode over a more
general mechanism such as Unix sockets. This also enables
high-bandwidth communication between Soar and ModSAF to be made
more efficiently. Since both systems have a scheduler but one
system must be in control of the primary scheduling, it was decided
that ModSAF should call upon Soar at the appropriate times. This is
natural since ModSAF controls the simulated "world" and Soar agents
are agents in that world.
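The control relationship described here, with the simulation's scheduler calling each agent through ordinary function calls inside one process, can be sketched in Python as follows. `ToySimulation` and `ToyAgent` are hypothetical stand-ins for ModSAF and a Soar agent; the actual systems are C libraries, and none of these names come from them.

```python
class ToyAgent:
    """Stand-in for one agent with its own private memory."""

    def __init__(self, name, climb_to):
        self.name = name
        self.climb_to = climb_to
        self.memory = {}      # private to this agent
        self.cycles = 0

    def run_cycle(self, inputs):
        # Input arrives as a direct function call; no encoding or sockets.
        self.memory.update(inputs)
        self.cycles += 1
        return {"desired-altitude": self.climb_to}


class ToySimulation:
    """Stand-in for the simulation, which owns the primary scheduling loop."""

    def __init__(self):
        self.vehicles = {}
        self.agents = {}

    def add_vehicle(self, vehicle_id, altitude, agent=None):
        self.vehicles[vehicle_id] = {"altitude": altitude,
                                     "desired-altitude": altitude}
        if agent is not None:
            self.agents[vehicle_id] = agent

    def tick(self):
        for vehicle_id, vehicle in self.vehicles.items():
            agent = self.agents.get(vehicle_id)
            if agent is None:
                continue      # vehicle is controlled by something else
            commands = agent.run_cycle({"altitude": vehicle["altitude"]})
            vehicle.update(commands)


sim = ToySimulation()
lead = ToyAgent("lead", climb_to=9000)
wing = ToyAgent("wing", climb_to=8500)
sim.add_vehicle("v1", 7900, lead)
sim.add_vehicle("v2", 7900, wing)
sim.add_vehicle("v3", 5000)           # no agent attached
for _ in range(3):
    sim.tick()
```

Placing the loop in the simulation rather than in the agent mirrors the stated design choice that the owner of the simulated world decides when agents run.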

The incorporation of Soar, ModSAF, and the SMI into one process
was fairly easy since all are written in the C language and utilize
user-defined C libraries. The communication overhead is reduced by
handling all I/O data flow through C function calls. Input to Soar
systems takes the form of adding WMEs to Soar's memory. Output is
carried out through placing WMEs in specific parts of the memory.
To ease the task of adding Soar input working memory elements, an
existing package was used that provides a convenient API to manage
input working memory element retractions and assertions [SoarSIM].

For efficiency, the decision was made to only pass integer data
values to Soar even though ModSAF calculated some data values using
floating point numbers. The Soar input values were marked as being
changed based on the rounded off values. This greatly reduced the
number of memory updates needed during each Soar cycle. For small
real number data items, the values were scaled into a larger
integer range.
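The update-suppression idea can be sketched in a few lines of Python: a floating point sample triggers a memory update only when its rounded (and optionally scaled) integer value differs from the last one passed on. The class name `IntegerInput` and the scale factor of 1000 are assumptions for illustration; the source does not give the scale factors that were actually used.

```python
class IntegerInput:
    """Pass a value on only when its rounded integer form changes."""

    def __init__(self, scale=1):
        self.scale = scale      # assumed factor for small real-valued items
        self.value = None       # last integer handed to the agent
        self.updates = 0        # number of memory updates triggered

    def offer(self, sample):
        """Return True if this sample causes a working memory update."""
        quantized = round(sample * self.scale)
        if quantized == self.value:
            return False        # same integer as before: no update needed
        self.value = quantized
        self.updates += 1
        return True


# Altitude in feet: small floating point jitter causes no extra updates.
altitude = IntegerInput()
for sample in [7900.1, 7900.3, 7900.4, 7901.2]:
    altitude.offer(sample)

# A small real-valued item, scaled into a larger integer range first.
small = IntegerInput(scale=1000)
for sample in [0.0123, 0.0124, 0.0131]:
    small.offer(sample)
```

Without the scale factor the small-valued item would always round to zero and its changes would never reach the agent, which is presumably why such values were scaled.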

Until the advent of this project, Soar had been designed to
support just one agent per process. Since Soar and ModSAF were to
be run as a single process, the Soar system had to be modified to
support multiple independent agents. Soar was generalized to allow
the dynamic creation and destruction of agents, each operating with
independent memories and I/O channels. There were no definitive
criteria for defining an inter-agent communication mechanism, so
none was created.

Functionality of Soar, ModSAF, and
the SMI

Soar and ModSAF are very different systems. Each has certain
capabilities that the other lacks because they were designed to
different ends. When deciding where to implement certain
functionality, in Soar, ModSAF, or the SMI, the strengths of the
systems were the determining factors.

Ideally, ModSAF would be responsible for all vehicle and
environment simulation and network interfacing, thus representing
the aircraft and the world. Soar would be responsible for
interpreting the world and controlling the plane, as a human pilot
does. Such a clean separation is not possible. ModSAF provides a
method for controlling vehicles called tasks. A number of tasks
with different priorities can be assigned to a vehicle. The
behavior of a vehicle is the result of the action of these tasks.
Soar does not use these ModSAF tasks since a Soar agent typically
deliberates about such things. The separation of vehicle simulation
and tasks in ModSAF is not perfect, so the SMI fills in the gaps to
provide a cockpit simulation to Soar agents. The Soar agent, for
instance, sets the desired altitude and speed of the vehicle. This
means that some of the functionality provided in tasks must be
recreated in the SMI. This is due to the fact that there is no
convenient way to use the functionality of the tasks without
committing to use of more ModSAF machinery.

ModSAF has a library which provides facilities for editing various
data structures. These graphical editors are used for such
activities as creating vehicles and specifying missions to ModSAF
vehicles. The Soar agents, however, use a representation of
missions different from that provided by ModSAF. Therefore, the
library that implements the editor and one that uses it were
modified so that Soar-compatible missions can be created, saved,
and modified. The modification of the ModSAF libraries had a number
of advantages over writing an entirely new editor. First, the
ModSAF editor has sub-modules defined for editing various data
types, such as angles, speeds, and map locations. These sub-modules
are used by the Soar mission editor. Second, modifying the editor
library required much less time than would have been required to
create a new editor from scratch. Even the time required to make
these additional changes with each future release of ModSAF is
minor compared to the saved development time. Finally, as screen
area is at a premium, reusing area that is already allocated
benefits the user.

When Soar agents must communicate with one another, they must use
some medium, outside their I/O channels, just as humans do. In the
air-to-air domain, inter-agent communication is carried out over
radios. The generic radio interface of ModSAF* provides an
implementation of this form of communication. Natural language
character strings are sent in DIS Radio PDUs. Messages are
generated by Soar as lists of WMEs (one per word) which are then
turned into character strings by the SMI and passed to the ModSAF
radio.
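The word-list-to-string step can be sketched in Python as below. The source says only that a message is a list of WMEs with one element per word; the linked-list layout with `word` and `next` attributes, the triple representation, and the function name `message_to_string` are assumptions made for this illustration.

```python
def message_to_string(wmes, head):
    """Walk a linked list of word elements and join the words with spaces."""
    table = {(ident, attr): value for ident, attr, value in wmes}
    words = []
    visited = set()
    node = head
    while node is not None and node not in visited:
        visited.add(node)                  # guard against accidental cycles
        words.append(str(table[(node, "word")]))
        node = table.get((node, "next"))   # None marks the end of the list
    return " ".join(words)


# Hypothetical output elements for a three-word radio message.
message_wmes = [
    ("w1", "word", "roger"),
    ("w1", "next", "w2"),
    ("w2", "word", "dakota"),
    ("w2", "next", "w3"),
    ("w3", "word", "204"),
]

radio_text = message_to_string(message_wmes, "w1")
```

The resulting character string is what would then be handed to the radio layer for transmission.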

Soar and ModSAF do conflict in one area. ModSAF is a distributed
simulation which causes problems when agents are created on
separate hosts. When an agent is created, a user interface is
created for that agent, whether it be a new X window or a new I/O
stream interleaved onto standard input/output (used by the Soar
Development Environment (SDE) [SDE]). This is no problem when one
ModSAF is running on a local host. However, if more than one ModSAF
is running and ModSAF's load balancing is active, then locally
created agents will be simulated on remote hosts and their user
interface will appear remotely. Fortunately, there are simple
methods

*The generic radio library was added to ModSAF in version 1.0.
Prior to this, interagent communication was performed using Message
PDUs that were generated and interpreted by the SMI but sent and
received by ModSAF.
for forcing agents to remain on a local host. In the long term, it
would be useful to find a method for allowing load balancing
without interfering with the placement of the user interface. A
more difficult issue in regards to load balancing is the moving of
complex reasoning agents, such as the ones built in Soar. There is
no simple mechanism to transfer both the complex reasoning state
and the knowledge used in that reasoning to another machine, while
the agent is interacting in the simulated world.

Implementation  Details 

The SMI must honor several design constraints. Although the
primary focus is on the automated pilot which controls a single
aircraft, there may be additional agents associated with a vehicle.
A fighter aircraft may have a Radar Intercept Officer (RIO) and an
Air Intercept Control (AIC) aircraft may have air controllers. Any
of these agents may be created or destroyed at any point in the
simulation; there is no preset scenario.

An arbitrary number of agents may exist in the Soar system and an
arbitrary number of vehicles may exist in ModSAF. Not all of these
vehicles need be controlled by Soar agents. Some agents may be
controlled by other software modules or even by humans. The number
of such entities is limited only by the processing speed and memory
capacity of the host workstation. The SMI must be efficient so that
the performance of Soar and ModSAF does not degrade due to
excessive communication overhead between agents and their vehicles.

There are also implementation constraints on the SMI. Both Soar
and ModSAF are designed as standalone systems and continue to
undergo ongoing development. The SMI must enable new versions of
Soar and ModSAF to be incorporated. The Soar system is already
implemented with a number of hook functions and configurable
subsystems. Some of these facilities were generalized to work more
effectively with external systems such as ModSAF, but no changes
were needed to the Soar system releases. All SMI functionality is
incorporated through Soar's extensible mechanisms. The SMI
redefines Soar's scheduling command since ModSAF is in charge of
scheduling, and adds a number of commands useful in the air combat
domain. The SMI also adds a set of domain-specific right-hand side
functions used in Soar productions.

ModSAF is also designed with modularity as an important goal.
Hence, only one library out of over 100 was modified to incorporate
Soar and the SMI. In this library, the SMI is implemented as a
software layer connected to ModSAF at a level dealing with the
aircraft vehicle simulation. The SMI calls upon a number of ModSAF
libraries to help create the simulated cockpit. The ModSAF main
program and a few additional libraries required minor additions to
accommodate the SMI but their primary functionality was not
altered.

ModSAF  uses  Motif  and  X  for  its  graphical  user 
interface  (GUI)  as  well  as  standard  input/output 
for  its  command  line  interpreter.  Soar,  which  pre¬ 
viously  depended  on  standard  input/output,  was 
enhanced to include an X interface. This enabled
Soar,  ModSAF,  and  the  SMI  to  present  GUIs  to 
the  user  while  maintaining  module  independence. 
Each  module  opens  a  separate  display  connection 
to  the  user’s  console  and  receives  a  separate  event 
stream.  This  design  has  the  drawback  that  there 
is  contention  for  screen  real-estate  due  to  a  pro¬ 
liferation  of  separate  windows. 

The  SMI  GUI  enables  the  user  to  control  the 
simulation  speed.  This  is  helpful  for  speeding  up 
the  simulation  in  “dead  spots”  or  slowing  down 
the  simulation  to  observe  at  a  finer  grain  the 
changes  in  state.  Additions  to  the  SMI  GUI  are 
planned  and  will  provide  more  dynamic  control 
over  ModSAF  and  the  SMI.  In  addition  to  the 
SMI  GUI,  the  ModSAF  GUI  was  enhanced  by 
adding  two  windows.  The  first  provides  ortho¬ 
graphic  projections  of  the  PVD  so  that  altitude 
relationships  between  vehicles  may  be  depicted 
graphically. This window was augmented to provide
some other desirable features missing from
the ModSAF PVD: snail trails and radar volumes.
Snail  trails  depict  a  history  of  vehicle  positions 
over  time  by  using  a  series  of  dots.  The  radar 
volumes  are  shown  as  fans  indicating  radar  orien¬ 
tation,  beam  height,  and  beam  width.  The  sec¬ 
ond  window  presents  vehicle  status  information 
that  is  continually  updated  to  clarify  the  status 
of  vehicle  position,  orientation,  radar  sightings, 
and  weapon  employment. 
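
As an illustration of the snail-trail display described above, the following minimal Python sketch keeps a bounded history of positions for one vehicle and projects it onto a side view so that altitude relationships become visible. It is not the ModSAF or SMI implementation; the class name `SnailTrail`, the `max_dots` limit, and the (x, y, altitude) tuple layout are assumptions made for the example.

```python
from collections import deque


class SnailTrail:
    """Bounded history of vehicle positions, drawn as a series of dots.

    Hypothetical helper for illustration only; not part of ModSAF or the SMI.
    """

    def __init__(self, max_dots=20):
        # A deque with maxlen silently discards the oldest dot when full.
        self._dots = deque(maxlen=max_dots)

    def record(self, x, y, altitude):
        """Store one position sample (units are whatever the caller uses)."""
        self._dots.append((x, y, altitude))

    def plan_view(self):
        """Top-down projection: keep x and y, drop altitude."""
        return [(x, y) for x, y, _ in self._dots]

    def side_view(self):
        """Orthographic side projection: keep x and altitude, drop y."""
        return [(x, alt) for x, _, alt in self._dots]
```

Capping the history keeps the per-vehicle cost constant, which matters when the number of vehicles is limited only by the host workstation.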

An  alternative  interface  to  Soar  was  devel¬ 
oped  independently  which  utilizes  standard  in¬ 
put/output.  This  interface,  the  SDE,  runs  in 
Emacs.  It  removes  the  need  for  separate  win¬ 
dows  for  Soar  agents  but  forces  the  elimination 
of  the  ModSAF  command  line  interpreter.  Both 
Soar  interfaces  have  their  uses  and  Soar  develop¬ 
ers  have  not  fully  committed  to  one  or  the  other. 

Conclusion 

The  problem  of  connecting  Soar  to  ModSAF 
has  brought  some  interesting  technical  challenges. 
The  challenges  have  helped  the  Soar  system  devel¬ 
opers  to  generalize  Soar’s  extension  mechanisms 
enabling  all  Soar  users  to  benefit.  And  the  Mod¬ 
SAF  environment  has  been  an  effective  tool  en¬ 
abling  Soar  agent  developers  to  focus  more  closely 
on  modeling  human  pilots. 
1. Introduction

The  goal  of  our  research  effort  is  to  develop 
generic  technology  for  intelligent  automated 
agents  in  simulation  environments.  These  agents 
are  to  behave  believably  like  humans  in  these 
environments.  In  this  context,  believability 
refers  to  the  indistinguishability  of  these  agents 
from  humans,  given  the  task  being  performed, 
its  scope,  and  the  allowable  mode(s)  of 
interaction  during  task  performance.  For 
instance,  for  a  given  simulation  task,  one 
allowable  mode  of  interaction  with  an  agent  may 
be  typewritten  questions  and  answers  on  a 
limited subject matter. Alternatively, a different
allowable mode of interaction for the same (or
different) task may be speech rather than
typewritten words. In all these cases,
believability  implies  that  the  agent  must  be 
indistinguishable  from  a  human,  given  the 
particular  mode  of  interaction.  Such  an  agent 
technology  can  potentially  provide  virtual 
humans  for  the  multitude  of  virtual  reality 
environments  under  construction.  Its 
applications  can  be  found  in  many  fields, 
including  entertainment  [1],  education  [5, 
chapter  3],  and  training  [2]. 

To  begin  this  effort,  we  have  focused  on 
creating  specific  automated  agents  for  simulated 
tactical  air  combat.  The  automated  agents  act  as 
the  virtual  pilots  for  simulated  aircraft,  and  will 
participate  in  exercises  with  real  Navy  pilots. 
These exercises will aid in training Navy pilots,
development of tactics, and evaluation of
proposed hardware. This is a non-trivial task,
with  many  real-world  complexities,  and  as  such 
it  offers  several  advantages.  It  pushes  research 
based  on  real-world  needs  on  topics  such  as 
reactivity,  real-time  reasoning,  planning, 
episodic  memory,  agent  modeling,  temporal 
reasoning,  explanation,  and  natural  language 
understanding/generation.  Furthermore,  it 
forces  the  integration  of  all  of  these  component 
AI  technologies,  because  it  requires  a  single 
automated  agent  to  perform  all  of  the  functions 
performed  by  a  pilot  in  air  combat. 
Simultaneously,  however,  as  a  simulation  task,  it 
delimits  the  component  technologies  to  be 
integrated.  For  example,  it  does  not  force  the 
integration  of  vision  or  locomotion  components. 
Finally,  the  task  also  imposes  external  metrics 
for  success. 

The  task  also  poses  an  important  constraint: 
the  automated  agents  must  believably  act  and 
react  like  trained  human  pilots.  These  agents  are 
to  take  part  in  exercises  with  other  human  pilots. 
If  human  trainees  identify  our  agents  as 
automated  pilots,  they  may  take  advantage  of 
specific  known  characteristics  of  their  behavior. 
Training  in  such  a  situation  could  actually  be 
harmful.  For  instance,  if  the  automated  agents 
do  not  react  as  quickly  as  other  human  pilots  (or 
react  too  quickly),  trainees  may  learn  to  act  too 
aggressively  (or  not  aggressively  enough)  in  a 
real  aerial  combat.  Additionally,  if  the  agents 
behave unrealistically, observers and tacticians
at  "ground  control"  (who  can  watch  the 
simulated  combat  from  different  perspectives), 
may  not  be  able  to  develop  realistic  tactics  and 
strategies. 

Thus,  this  task  requires  the  development  of 
believable  automated  pilots.  For  this  fixed  task, 
believability  refers  to  the  indistinguishability  of 
the  automated  pilot  from  a  human  pilot,  given 
the  scope  of  the  task,  and  the  allowable  modes 
of  interaction.  The  scope  of  the  task  depends  on 
(at  least)  the  number  of  aircraft  involved  on  each 
side, e.g., whether it is a one "friendly" aircraft
versus one "enemy" aircraft (1v1) air-combat
situation, or a 2v1 or 2vN situation. The
allowable  modes  of  interaction  depend  on 
whether  it  is  a  Beyond  Visual  Range  (BVR) 
combat  situation,  where  pilots  only  get  radar 
information  about  the  enemy  aircraft,  or  Within 
Visual  Range  (WVR)  combat  situation,  where 
the  pilots  can  also  directly  see  the  enemy 
aircraft. In 2v1 (or 2vN) combat situations,
additional modes of interaction are possible: the
pilots  of  two  or  more  "friendly"  aircraft  may 
communicate  via  radios,  electronic  data  links,  or 
even  by  executing  simple  maneuvers.  A  human 
observer  at  "ground  control"  adds  even  more 
modes  of  interaction.  He/she  can  observe  the 
combat  in  progress  on  a  TV  monitor,  zoom  in 
and  out  on  it,  focus  on  the  maneuvers  of  a 
particular  aircraft,  and  so  on.  A  passive  observer 
can  only  observe  the  combat  in  progress,  while 
an  active  observer  can  supply  the  pilots  new 
information  or  commands  over  the  radio. 

The  specific  scope  of  the  task,  together  with 
the  choice  of  certain  modes  of  interaction, 
dictates  the  capabilities  an  agent  must  possess 
for  believability.  These  capabilities  define  a 
certain  level  of  believability.  If  the  agent 
possesses  these  capabilities,  then  we  refer  to  it  as 
having  (or  being  at)  this  level  of  believability. 
For instance, consider a 1v1 BVR air-combat
situation,  with  no  observers,  and  with  a  single 
human  pilot  engaged  in  combat  with  a  single 
automated  agent.  The  only  mode  of  interaction 
is  what  the  human  pilot  can  view  of  the 
automated  agent’s  actions  on  its  radar.  The 
capabilities  required  for  believability  are  that 
these  actions  must  appear  like  those  of  a  trained 
human  pilot.  An  agent  with  these  capabilities 
has  a  certain  (moderate)  level  of  believability. 
Suppose we add a passive observer to this
situation. Since the observer can watch the
automated  agent’s  actions  much  more  closely, 
the  agent  must  have  a  higher  level  of 
believability.  As  we  add  more  aircraft,  an  active 
observer,  and  switch  to  WVR,  the  agent  must 
have  even  higher  levels  of  believability,  with 
requirements  for  capabilities  such  as  natural 
language  (and  speech)  understanding/generation 
to  support  different  types  of  radio 
communication. 

The  levels  of  believability  provide  us  a  means 
of  staging  an  attack  on  this  problem  (and 
correspondingly  staging  the  system  development 
effort).  Thus,  to  begin  this  effort,  we  have 
focused  on  an  agent  at  a  moderate  level  of 
believability: an agent for 1v1 BVR air-combat,
with a passive observer. Even at this level, the
task remains highly knowledge- and capability-intensive.
Trained Navy pilots possess vast
knowledge  about  different  mission  types,  tactics 
and  maneuvers,  performance  characteristics  of 
the  aircraft,  radar  modes,  missile  types  and  so 
on.  The  challenge  for  constructing  an  automated 
agent  is  then  to  integrate  this  knowledge  into  a 
single  system,  along  with  the  following 
capabilities:

1. The agent must be extremely flexible in its
behavior:  Situations  in  air  combat  can 
change  very  rapidly.  Unexpected  events 
can  occur,  e.g.,  an  on-target  missile  may 
fail  to  explode,  or  an  aggressive  adversary 
may  engage  in  some  preemptive  action 
disrupting  an  ongoing  maneuver. 
Accordingly,  the  agent  must  respond 
flexibly  to  the  evolving  situation. 

2. The agent must act/react in real-time:
Since  a  human  may  be  interacting  with  the 
agent  in  real-time,  the  agent  must  act/react 
in  real-time  as  well. 

3.  The  agent  must  try  to  interleave  multiple 
high-level  goals:  For  this  task,  the  agent 
must  continuously  attend  to  at  least  three 
high-level goals: (a) executing maneuvers
to  destroy  the  opponent;  (b)  surviving 
opponents’  weapon  firings;  and  (c) 
interpreting  opponents’  actions.  Given  the 
need  for  real-time  response,  the  agent  must 
be  capable  of  rapidly  switching  among 
these  goals  (or  achieving  them  in  parallel). 

4.  The  agent  must  conform  to  human  reaction 
times  and  other  human  limitations:  As 
discussed earlier, the agent must not react
to  input  data  faster  (or  slower)  than  a 
human  pilot  would.  The  agent  must  also  not 
maneuver  the  simulated  aircraft  like  a 
"superhuman",  e.g.,  it  must  not  make  very 
sharp  turns.  Finally,  the  agent  must  exhibit 
some  unpredictability  in  its  behavior,  when 
appropriate. 

5. Others: Some other capabilities, such as
planning and temporal reasoning, are also
required for this task in limited proportions.

Note  that,  because  a  passive  observer  can 
watch  an  automated  agent  more  closely  than 
what  is  visible  on  radar,  this  additional  level  of 
believability  requires  more  accurate  modeling  of 
human  reaction  time  and  physical  limitations. 


2.  Developing  Believable  Pilot  Agents 

The  basis  of  our  work  on  developing 
automated  agents  is  the  Soar  integrated 
architecture [4, 6]. (Due to space constraints, we
assume that the reader has some familiarity
with the Soar architecture.) Some of the
characteristics  of  this  task  are  particularly  well- 
suited  for  Soar.  First,  Soar  is  a  single  unified 
architecture  for  the  research,  development  and 
integration  of  various  component  AI 
technologies.  Second,  Soar  represents  a 
developing  unified  theory  of  cognition,  which  is 
advantageous,  given  the  constraint  of 
psychological  verisimilitude  (e.g.,  limitation  on 
reaction  time)  in  this  task. 

The automated pilots for the 1v1 BVR air-
combat task are based on TacAir-Soar, a system
developed  within  the  Soar  architecture,  which 
currently  includes  about  1100  productions. 
TacAir-Soar  encodes  the  basic  task  knowledge 
for  an  agent  in  a  set  of  problem  spaces.  A 
particular  automated  agent  is  realized  by 
initializing  TacAir-Soar  with  a  specific  set  of 
parameters,  such  as  its  mission,  the  level  of  risk 
it  can  take  for  the  mission,  and  the  kind  of 
weapons  it  has  available. 

The  current  design  of  TacAir-Soar  is  guided 
by  two  sets  of  constraints:  the  task  requirements 
(as  specified  by  the  targeted  level  of 
believability), and the Soar architecture itself.
Consider  the  key  requirement  of  flexibility  of 
behavior.  This  has  turned  out  to  be  a  strong 
constraint on the design of problem spaces and
operators. For instance, any maneuver
consisting  of  a  sequence  of  actions  is 
implemented  not  as  a  single  monolithic  plan,  but 
rather  as  a  sequence  of  appropriately 
conditioned  operators  in  a  problem-space.  This 
allows  TacAir-Soar  to  respond  flexibly  to  an 
evolving  situation,  and  not  remain  rigidly 
committed  to  a  specific  plan.  Furthermore,  this 
constraint  discourages  highly  specific,  narrowly 
focused  problem  spaces.  For  instance,  a 
problem  space  devoted  solely  to  employing  one 
type  of  missile  may  not  allow  the  system  to 
switch  quickly  to  employing  a  different  type  of 
missile,  as  the  situation  rapidly  evolves.  In 
contrast,  a  problem  space  that  combines  the 
operators  for  employing  different  types  of 
missiles  facilitates  such  actions. 
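
The contrast between a monolithic plan and a set of conditioned operators can be sketched as follows. This is a minimal Python illustration, not Soar production syntax and not TacAir-Soar code: the `Operator` class, the operator names, the state keys `heading_error_deg` and `range_nm`, and all numeric thresholds are invented for the example. The point is only that each operator is re-evaluated against the current state, so a single problem space covering several missile types can switch between them as the situation changes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass(frozen=True)
class Operator:
    """An action paired with the condition under which it may be proposed."""
    name: str
    condition: Callable[[State], bool]


def on_attack_heading(state: State) -> bool:
    # Hypothetical tolerance: within 5 degrees of the desired heading.
    return abs(state["heading_error_deg"]) <= 5.0


# One problem space that combines operators for different missile types.
# The 10-mile boundary is an assumption chosen purely for illustration.
EMPLOY_MISSILE_SPACE: List[Operator] = [
    Operator("achieve-attack-heading",
             lambda s: not on_attack_heading(s)),
    Operator("fire-long-range-missile",
             lambda s: on_attack_heading(s) and s["range_nm"] >= 10.0),
    Operator("fire-short-range-missile",
             lambda s: on_attack_heading(s) and s["range_nm"] < 10.0),
]


def applicable(space: List[Operator], state: State) -> List[str]:
    """Return the names of all operators whose conditions hold right now."""
    return [op.name for op in space if op.condition(state)]
```

Because nothing commits the agent to a fixed ordering, a sudden change in range or heading simply changes which operators are proposed on the next evaluation.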

TacAir-Soar’s  highly  reactive  behavior 
derives  at  least  in  part  from  Soar’s  ability  to 
react  at  a  number  of  different  levels  [3]. 
Specifically,  Soar  can  respond  to  new  inputs  at 
three  levels:  (i)  in  a  single  production  firing,  (ii) 
in  a  single  decision,  which  involves  firing 
multiple  productions,  or  (iii)  in  a  problem-space, 
which  involves  executing  multiple  decisions. 
Thus, as the situation changes, Soar can respond
very  quickly  within  the  time-span  of  a  single 
production  firing.  If  needed,  it  may  also  respond 
after  much  deliberation  in  a  problem  space. 
Additionally,  Soar’s  efficient  implementation 
technology  plays  a  large  role  in  allowing  it  to 
respond  in  real  time. 

In  achieving  multiple  high-level  goals, 
TacAir-Soar  faces  an  interesting  issue:  as 
limited  by  the  Soar  architecture,  it  cannot 
construct  multiple  goal/problem-space 
hierarchies  (in  parallel)  in  service  of  the  high- 
level  goals.  TacAir-Soar  can  and  does  construct 
a  goal  hierarchy  in  an  attempt  to  achieve  the 
high-level  goal  of  destroying  the  opponent.  For 
instance,  to  achieve  the  goal  of  destroying  the 
opponent it creates a subgoal to "destroy-with-missile".
To achieve destroy-with-missile, it
generates  subgoals  to  get  into  missile  firing 
range,  and  so  on.  However,  TacAir-Soar  cannot 
construct  goal-hierarchies  for  its  remaining  high- 
level  goals  —  survival  and  interpretation  of 
opponent  actions  —  in  parallel.  To  address  this 
limitation,  TacAir-Soar  opportunistically  installs 
operators  for  these  high-level  goals  into  its 
existing  goal  hierarchy  (without  eliminating  the 
hierarchy).  This  avoids  the  overhead  of 
rebuilding the goal hierarchy, while allowing it
to  switch  attention  among  different  types  of 
goals  rapidly.  While  this  solution  has  allowed 
TacAir-Soar  to  exhibit  reasonable  performance 
so  far,  it  does  have  some  disadvantages.  First,  by 
not  representing  the  different  goal  hierarchies 
explicitly,  the  solution  does  hinder  TacAir- 
Soar’s  ability  to  reason  about  the  interactions 
between  multiple  goals.  Second,  it  is  unclear  if 
the  scheme  will  generalize  beyond  the  targeted 
level  of  believability.  For  instance,  it  is  unclear 
if  natural  language  understanding/generation 
will  fit  into  this  scheme.  Alternative  solutions 
are  currently  under  investigation. 

TacAir-Soar's  ability  to  adhere  to  human 
reaction  times  is  hindered  by  the  artificiality  of 
the  interface  to  the  simulation  environment.  In 
particular,  TacAir-Soar  does  not  spend  time 
physically  manipulating  different  instruments 
(e.g.,  turning  a  knob),  or  decoding  actual 
instrument  displays  (e.g.,  decoding  radar 
displays).  As  a  result,  TacAir-Soar  tends  to  react 
faster  than  human  pilots  in  some  situations. 
Therefore,  deliberate  delays  have  been  set  up  to 
slow  down  some  of  TacAir-Soar’s  responses. 
Similarly,  TacAir-Soar’s  turning  maneuvers 
have been constrained so as not to exceed
human capability. As for unpredictability, much
of it occurs "naturally" in TacAir-Soar. In
particular,  while  two  complex  situations  may 
appear  very  similar  to  a  human  observer,  they 
may  be  quite  different  from  TacAir-Soar’s 
perspective,  leading  TacAir-Soar  to  two 
different  actions.  To  add  to  this  unpredictability, 
TacAir-Soar  does  random  selection  among 
operators  that  are  considered  to  be  equally 
appropriate  in  a  given  situation. 
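
The three measures described above (deliberate delays, constrained turns, and random selection among equally appropriate operators) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the TacAir-Soar mechanism: the numeric preference score, the function names, and the units are all hypothetical.

```python
import random


def select_operator(candidates, rng):
    """Pick randomly among the candidates tied for the best score.

    `candidates` is a list of (name, score) pairs; the numeric score is an
    assumption standing in for "considered equally appropriate".
    """
    best = max(score for _, score in candidates)
    tied = [name for name, score in candidates if score == best]
    return rng.choice(tied)


def clamp_turn_rate(requested, limit):
    """Limit a requested turn rate to what a human pilot could command."""
    return max(-limit, min(limit, requested))


def earliest_response_time(stimulus_time, reaction_delay):
    """Delay a response so the agent does not react faster than a human."""
    return stimulus_time + reaction_delay
```

Passing the random number generator in explicitly keeps a run reproducible when a fixed seed is wanted, for example when replaying an engagement for an observer.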


3.  Current  Status  and  Future  Plans 
Currently, even with approximately 1100
productions, the TacAir-Soar system continues
to perform well within real-time constraints.
Agents  based  on  TacAir-Soar  are  fairly  capable 
and  robust  within  a  narrow  range  of  missions  for 
1v1 BVR combat. Recently, in a demonstration
organized  for  Navy  personnel,  these  agents  were 
tested  against  (constrained)  human  pilots.  The 
demonstration  was  a  success  in  that  the  agents 
were  able  to  function  adequately  at  this  targeted 
level  of  believability,  i.e.,  they  were  able  to  react 
realistically to the human pilots.


We  are  currently  extending  TacAir-Soar  to 
deal  with  co-ordinated  multi-aircraft  air-combat 
simulations.  Essentially,  we  are  extending 
TacAir-Soar  agents  to  higher  levels  of 
believability,  and  hence  need  integration  of 
capabilities  such  as  natural  language 
understanding/generation.  Thus,  so  far,  for  this 
task,  the  levels  of  believability  appear  to  be 
useful  as  a  means  of  staging  development,  as 
well  as  for  measuring  believability.  Whether  this 
usefulness  will  continue  in  the  future,  or  for 
other  tasks,  remains  to  be  seen. 
Abstract

The  Soar-IFOR  project  is  aimed  at  developing 
intelligent  automated  pilots  for  simulated  tactical 
air-combat.  One  key  requirement  for  an  automated 
pilot  in  this  environment  is  event  tracking:  the 
ability  to  monitor  or  track  events  instigated  by 
opponents,  so  as  to  respond  to  them  appropriately. 
These events include the opponents’ low-level
actions, which the automated pilot may directly
observe, as well as opponents’ high-level plans and
actions, which the automated pilot cannot observe
(but only infer). This paper analyzes the challenges
that an automated pilot must face when tracking
events  in  this  environment.  This  analysis  reveals 
some  novel  constraints  on  event  tracking  that  arise 
from  the  dynamic  multi-agent  interactions  in  this 
environment. In previous work on event tracking,
which  is  primarily  based  on  single-agent 
environments,  these  constraints  have  not  been 
addressed.  This  paper  proposes  one  solution  for 
event  tracking  that  appears  better  suited  for 
addressing  these  constraints.  The  solution  is 
demonstrated  via  a  simple  re-implementation  of  an 
existing  automated  pilot  agent  for  air-combat 
simulation.


1.  Introduction 

The  Soar-IFOR  project  is  aimed  at  developing 
intelligent  automated  pilots  for  simulated  tactical 
air-combat  environments  [11, 17].  These 
automated  pilots  are  intended  to  participate  in 
large-scale  exercises  with  a  variety  of  human 
participants,  including  human  fighter  pilots.  These 
exercises  are  to  be  used  for  training  as  well  as  for 
development  of  tactics.  To  participate  in  such 
exercises, the automated pilots must act in a
realistic manner, i.e., like trained human pilots.
Otherwise,  both  the  training  and  tactics 
development  in  these  environments  will  not  be 
realistic. 

To  act  in  a  realistic  manner,  an  automated  pilot 
must,  among  other  things,  be  responsive  to  events 
in  its  environment  —  it  must  modify  and  adapt  its 
own  maneuvers  in  response  to  relevant  events. 
These  events  may  correspond  to  simple  actions  of 
other  pilots,  such  as  changes  in  heading  or  altitude, 
which  the  automated  pilot  may  directly  observe  on 
its  radar.  Alternatively,  these  events  may  involve 
the  execution  of  complex,  high-level  actions  or 
plans  of  other  pilots,  which  the  automated  pilot  can 
not  directly  observe.  For  instance,  one  crucial 
event  is  an  opponent’s  firing  a  missile  at  an 
automated  pilot’s  aircraft,  threatening  its  very 
survival.  Yet,  the  automated  pilot  cannot  directly 
see  the  missile  until  it  is  too  late  to  evade  it. 
Fortunately,  the  automated  pilot  can  monitor  the 
opponent’s  sequence  of  maneuvers,  and  infer  the 
possibility  of  a  missile  firing  based  on  them,  as 
shown  in  Figure  1.  The  automated  pilot  is  in  the 
dark-shaded  aircraft,  and  its  opponent  is  in  the 
light-shaded  one. 


[Figure 1, panels (a) through (e)]

Figure 1: Maneuvers of the automated pilot (in dark-shaded
aircraft) and its opponent (in light-shaded one).

Suppose  that  initially  the  two  aircraft  are  headed 
right  toward  each  other  as  shown  in  Figure  1-a. 
The range (distance) between the two aircraft is
more than 10-15 miles, so they can only see each
other  on  radar.  This  range  is  slightly  short  of  the 
range  from  which  the  opponent  can  fire  a  radar- 
guided  missile  at  the  automated  pilot's  aircraft. 
However,  the  opponent  is  already  well-positioned 
to  fire  this  missile  once  its  range  is  reached.  In 
particular, given that the two aircraft are pointing
right  at  each  other,  the  opponent’s  aircraft  is  at 
attack  heading  (a  point  slightly  in  front  of  the 
automated  agent’s  aircraft,  as  shown  by  a  small  x 
in  the  figure).  At  this  juncture,  the  automated  pilot 
turns its aircraft as shown in Figure 1-b. Given that
the opponent wants to fire a missile, she turns her
aircraft in response to re-orient it to attack heading
(Figure 1-c). In this situation, she reaches her
missile firing range, and fires a missile (shown by
-). While the automated agent cannot observe this
missile,  based  on  the  opponent’s  turn  it  can  infer 
that  the  opponent  may  be  attempting  to  achieve 
attack  heading  as  part  of  her  missile  firing 
behavior.  Unfortunately,  at  this  point,  it  cannot  be 
certain  about  the  opponent’s  missile  firing,  at  least 
not to an extent where trained fighter pilots would
infer  a  missile  firing.  However,  if  the  opponent 
subsequently  engages  in  an  Fpole  maneuver  then 
that  considerably  increases  the  likelihood  of  a 
missile  firing  (Figure  1-d).  This  maneuver 
involves a 25-50 degree turn away from the attack
heading  (it  is  executed  after  firing  a  missile  to 
provide  radar  guidance  to  the  missile,  while 
reducing  the  closure  between  the  two  aircraft). 
While  at  this  point  the  opponent’s  missile  firing  is 
still  not  an  absolute  certainty,  its  likelihood  is  high 
enough,  so  that  trained  fighter  pilots  assume  the 
worst,  and  react  as  though  a  missile  has  actually 
been fired. The automated pilot reacts in a similar
manner, by engaging in a missile-evasion
maneuver. This involves turning the aircraft
roughly perpendicular to the missile-flight (Figure
1-e), which causes the aircraft to "drop off"
(become invisible to) the opponent’s radar.
Deprived of radar guidance, the opponent’s missile
is rendered harmless.
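
The inference chain in this example can be sketched as a small Python routine. This is a hedged illustration of the reasoning described above, not the automated pilot's actual implementation: it assumes a fixed attack heading, a hypothetical 5-degree tolerance for "at attack heading", and three invented status labels, and it ignores the missile-range check for brevity. The turn band of 25 to 50 degrees is taken from the description of the Fpole maneuver.

```python
def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def infer_missile_firing(headings, attack_heading, tolerance=5.0):
    """Classify an observed sequence of opponent headings.

    Returns "none", "possible" (opponent reached attack heading), or
    "likely" (attack heading followed by an Fpole-like turn away).
    All thresholds other than the 25-50 degree band are assumptions.
    """
    status = "none"
    for heading in headings:
        off = abs(angle_diff(heading, attack_heading))
        if off <= tolerance:
            status = "possible"
        elif status == "possible" and 25.0 <= off <= 50.0:
            return "likely"
    return status


def evasion_heading(own_heading, missile_bearing):
    """Heading roughly perpendicular to the missile's line of flight.

    Of the two perpendicular headings, choose the one needing less turn.
    """
    candidates = [(missile_bearing + 90) % 360, (missile_bearing - 90) % 360]
    return min(candidates, key=lambda c: abs(angle_diff(c, own_heading)))
```

Note that a turn away from attack heading counts as evidence only after attack heading has been observed; the same turn seen in isolation leaves the status unchanged, which mirrors the way the Fpole maneuver raises the likelihood of a firing rather than establishing it on its own.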

The  above  example  illustrates  that  an  automated 
pilot needs to continually monitor a variety of
events  in  its  environment,  such  as  the  opponent’s 
turns  and  her  (inferred)  missile-firing  behavior,  so 
as  to  react  to  them  appropriately.  We  refer  to  this 
capability  as  event  tracking.  Here,  an  event  may  be 
considered  as  any  coherent  activity  over  an 
interval  of  time.  An  event  is  similar  to  a  process  in 
qualitative process theory [8], as something that
acts  through  time  to  change  the  parameters  of 
objects in a situation. This event may be a low-level
action, such as an agent’s Fpole turn, or it
may be a high-level behavior, such as its missile-firing
behavior, which consists of a sequence of
such  turns.  The  event  may  be  internal  to  an  agent, 
such  as  maintaining  a  goal  or  executing  a  plan,  or 
external  to  it,  such  as  executing  an  action.  The 
event  may  be  instigated  by  any  of  the  agents  in  the 
environment,  including  the  agent  tracking  the 
events,  or  by  none  of  them  (e.g.,  a  lightning  bolt). 
The  event  may  be  observed  by  an  agent,  perhaps 
on  radar,  or  it  may  be  unobserved,  but  inferred. 
Tracking  any  one  of  these  events  refers  to 
recording  it  in  memory  and  monitoring  its  progress 
as  long  as  necessary  to  take  appropriate  action  in 
response  to  it.  Tracking  an  event  also  includes  the 
ability  to  infer  the  occurrence  of  that  event  from 
other  events. 

Event  tracking  is  closely  related  to  the  problem 
of plan recognition [12], the process of inferring an
agent’s  plan  based  on  observations  of  the  agent’s 
actions.  The  term  event  tracking  is  preferred  in 
this  investigation,  since  it  also  involves  events 
other  than  plans,  and  since  it  is  a  continuous  on¬ 
going  activity.  However,  more  important  than  the 
terminology,  of  course,  is  gaining  a  better 
understanding  of  the  nature  of  this  capability.  In 
particular,  does  the  realistic  multi-agent  setting  of 
air-combat  simulation  reveal  anything  new  about 
event  tracking?  Given  the  complexity  of  this 
domain,  answering  this  question  in  its  entirety  is 
beyond  the  scope  of  this  single  investigation. 
However,  this  paper  takes  a  first  step  by  focusing 
on events relating to the actions and behaviors of
one  or  two  opponents  as  they  confront  the 
automated  pilot.  Section  2  illustrates  that  even 
within  this  restricted  context,  the  air-combat 
domain  brings  forth  some  novel  constraints  on 
event tracking. Following this, Section 3 presents
one  approach  that  we  have  been  investigating  to 
address  these  constraints.  The  key  idea  in  this 
solution  is  a  basic  shift  in  the  agent’s  reasoning 
framework: from the usual agent-centric to world-centric.
Finally, Section 4 presents a summary and
issues for future work.


2.  Event  Tracking  in  Air-Combat 
Simulation 

The  primary  constraint  on  event  tracking  in  air- 
combat  simulation  arises  from  the  fact  that  this  is  a 
dynamic  environment,  where  agents  continually 
interact.  This  continuous  interaction  implies  that 
the  agents  cannot  rigidly  commit  to  performing  a 
fixed  sequence  of  actions.  Instead,  they  need  high 
behavioral  flexibility  and  reactivity  in  order  to 
achieve their goals. For instance, in Figure 1-c, the
opponent  has  to  re-orient  herself  to  a  new  attack 
heading  in  response  to  the  automated  pilot’s  turn  in 
Figure 1-b. If the automated pilot had turned in
the  opposite  direction,  so  would  have  the 
opponent.  A  more  complex  interaction  occurs  in 
Figure  1-e,  where  the  automated  pilot’s  missile 
evasion  maneuver  is  a  response  to  the  opponent’s 
overall maneuvers in Figures 1-c and 1-d, which
are  identified  as  part  of  her  missile  firing  behavior. 

These  types  of  agent  interactions  extend  well 
beyond  situations  involving  just  two  aircraft.  For 
instance,  consider  a  situation  where  there  are  two 
opponents attacking the automated pilot’s aircraft,
as  shown  in  Figure  2-a.  Again,  the  automated  pilot 
is  in  the  dark-shaded  aircraft,  and  the  opponents 
are  in  the  light-shaded  aircraft.  These  opponents 
may  either  closely  co-ordinate  their  attack  or  they 
may  attack  independently.  One  method  of  close 
co-ordination in the opponents’ attack is shown in
Figure 2-b. Here, the opponent closer to the
automated pilot’s aircraft (the lead) leads the
attack, while the second opponent, marked with x
(the wingman) just stays close to the lead, and
follows  her  commands.  Thus,  as  the  lead  turns  to 
gain  positional  advantage,  the  wingman  needs  to 
turn  in  that  direction  as  well,  so  as  to  fly  in 
formation  with  the  lead,  all  the  while  making  sure 
that  she  does  not  get  in  between  the  lead  and  the 
automated  pilot’s  aircraft.  Another  method  of  close 
co-ordination  is  shown  in  Figure  2-c.  Here,  the 
opponents  execute  a  coordinated  pincer  maneuver 
—  as  the  lead  turns  in  one  direction,  the  wingman 
turns  in  the  opposite  direction,  so  as  to  confuse  the 
automated  pilot  and  attack  it  from  two  sides. 
There  are  other  possibilities  of  co-ordinating  the 
attack  as  well.  Of  course,  the  opponents  may  not 
co-ordinate  their  attack.  They  may  instead  try  to 
gain  positional  advantage  in  the  combat 
independently  of  each  other,  and  attack 
independently.  In  all  these  situations,  all  three 
aircraft  continually  influence  each  other’s  actions 
and  behaviors  in  different  ways.  If  other  aircraft 
are  involved  in  the  combat  —  for  instance,  if  the 
automated  pilot  is  coordinating  its  attack  with  a 
friendly  aircraft  —  then  they  also  interact  with  the 
other  aircraft  involved  in  the  combat. 

This  dynamic  interaction  among  the  agents  leads 
to  the  primary  constraint  on  event  tracking  in  this 
domain:  an  agent  must  be  able  to  track  highly 
flexible  and  reactive  behaviors  of  its  opponent.  In 
so  doing,  the  agent  must  take  the  appropriate  agent 
interaction  into  account.  Without  an  understanding 
of  this  interaction,  an  opponent’s  action  may  lead 
to  unuseful  or  even  misleading  interpretation.  For 
instance,  the  opponent’s  turn  in  Figure  1-c  needs  to 
be  tracked  as  a  response  to  the  automated  pilot’s 
own turn in Figure 1-b. Otherwise, that turn may
appear meaningless. Similarly, as shown in Figure
2, the wingman may mainly be reacting to its
lead’s turns, or she may be reacting to the
automated pilot’s aircraft independently.
Understanding this interaction is important in
tracking the wingman’s actions.

Figure 2: Agent interactions: (a) two opponents attacking
the automated pilot’s aircraft; (b) opponents stay
close; (c) opponents stage a co-ordinated "pincer".

A second related constraint here is that event
tracking must occur in real-time and must not
hinder  an  agent  from  acting  in  real-time.  For 
instance,  in  Figure  1,  if  the  automated  pilot  does 
not  track  the  missile  firing  event  in  real-time  or 
does  not  react  to  it  in  real-time,  the  results  could  be 
fatal. 

The  third  constraint  on  event  tracking  is  that 
agents  must  be  able  to  expect  the  occurrence  of 
unseen,  but  on-going  events.  This  constraint  arises 
from  the  weakness  of  the  sensors  in  this  domain  — 
an  agent  must  sometimes  track  opponent’s  actions 
even  though  they  are  not  visible  on  radar.  For 
instance,  suppose  in  the  situation  in  Figure  2-c,  the 
automated  pilot  concentrates  its  attack  on  the  lead, 
and  as  a  result  the  wingman  (marked  with  x)  drops 
off the automated pilot’s radar. Here, given that the
opponents  are  inferred  to  be  executing  a  pincer 
maneuver,  even  though  the  wingman  drops  off  the 
radar,  some  expectation  about  her  position  can  be 
developed.  Thus,  the  automated  pilot  can  re-orient 
its  radar  and  reset  its  mode  to  re-establish  radar 
contact  with  the  wingman  if  there  is  a  need  to  do 
so  later  during  the  combat. 

The  fourth  and  final  constraint  on  event  tracking 
is  that  it  is  not  a  one-shot  recognition  task.  Instead, 
it  occurs  on  a  continual  basis,  at  least  as  long  as  it 
is  relevant  to  the  agent’s  achievement  of  its  goals 
(such  as  the  completion  of  its  mission). 

Thus,  this  domain  poses  a  challenging 
combination  of  constraints  for  event  tracking.  The 
most  novel  constraint  here  is  the  first  one.  In 
previous  investigations  in  the  related  areas  of 
plan/situation recognition [12, 16, 6, 18, 3]
(including one investigation focused on plan
recognition in airborne tactical decision making [2]),
this constraint has not been addressed. In
particular,  plan  recognition  models  have  not  been 
applied  in  such  dynamic,  interactive  multi-agent 
situations,  and  hence  do  not  address  strong 
interactions  among  agents  and  the  resulting 
flexibility  and  reactivity  in  agent  behaviors.  In 
particular,  these  models  assume  that  a  single 
planning  agent  (or  multiple  independent  planning 
agents)  has  some  plans,  and  a  recognizing  agent 
recognizes  these  plans.  The  planning  agent  may  be 
either  actively  cooperative  (it  intends  for  its  plans 
to  be  recognized  by  the  recognizing  agent)  or 
passive  (it  is  unconcerned  about  its  plans  being 
recognized) [4]. The recognizing agent’s job is to
recognize  these  plans  and  possibly  provide  a 
helpful  response.  However,  neither  the  recognizing 
agent,  nor  any  other  agents  in  the  environment  are 
assumed  to  have  any  influence  on  these  plans. 
Consequently,  these  plan  recognition  models  can 
rely  on  pre-compiled  plan  libraries,  where  each 
plan  lists  the  sequence  of  events  and  the  temporal 
relationships  among  the  events  [16].  However, 
such  lists  cannot  be  employed  in  tracking  highly 
flexible and reactive agent behaviors. In particular,
all  possible  variations  on  agent  behaviors  would 
need  to  be  included  in  such  lists,  leading  to  a 
combinatorial  explosion*  in  the  number  of  plans 
(unless  a  highly  expressive  plan  language  is 
developed). 

Grosz  and  Sidner  [9],  in  their  work  on  discourse 
situations,  attempt  to  partly  address  the  above 
constraint  on  event  tracking.  They  focus  on  what 
they  characterize  as  the  "master-slave"  relationship 
between  the  planning  agent  and  the  recognizing 
agent assumed in plan-recognition models, and
attempt  to  remedy  it  by  using  shared  plans. 
Agents  in  their  discourse  situations  arrive  at  a 
shared  plan  by  establishing  mutual  beliefs  and 
intentions  about  things  such  as  their  role  in 
executing  the  plan.  However,  their  discourse 
situations  involve  agents  that  are  actively 
cooperative,  while  agents  in  air-combat  simulation 
range  from  actively  co-operative  to  passive  to 
actively  un-cooperative. 

Interestingly,  while  plan-recognition  systems 
have not dealt with such dynamic multi-agent
situations, Distributed AI (DAI) systems, which
have dealt with such situations, have not addressed
the problem of plan recognition. There is some
work  in  DAI  on  understanding  other  agents’ 
plans  [7].  However,  it  focuses  on  agents 
exchanging  their  plan  data  structures  for  active 
cooperation,  rather  than  on  plan  recognition.  Thus, 
the  first  constraint  actually  appears  to  give  rise  to  a 
novel issue intersecting the areas of plan
recognition and DAI.

The  remaining  three  constraints  on  event 
tracking  —  real-time  performance,  expectations 
and  continuous  tracking  —  have  been  addressed  in 
previous research (e.g., in [6]). The next section
presents  an  approach  that  we  have  been 
investigating  for  event  tracking  that  addresses  all 
four  constraints  outlined  above. 


3.  Towards  a  Solution  for  Event 
Tracking 

The  key  idea  in  the  proposed  solution  for  event 
tracking  is  based  on  the  following  observation.  All 
of  the  agents  in  this  environment  possess  similar 
types  of  knowledge,  they  have  similar  goals,  and 
similar  levels  of  flexibility  and  reactivity  in  their 
behaviors.  In  particular,  an  automated  pilot  agent 
that  requires  the  capability  to  track  events  shares 
these  similarities  with  its  opponent.  Thus,  the  key 
idea  is  that  all  the  knowledge  and  implementation 
level  mechanisms  that  the  automated  pilot  agent 
uses  in  generating  its  own  flexible  behaviors  may 
be  used  in  service  of  tracking  flexible  behaviors  of 
other  agents. 

To  understand  this  idea  in  detail,  it  is  first  useful 
to  understand  how  an  agent  generates  its  own 
flexible  and  reactive  behaviors.  Section  3.1 
explains  this  by  focusing  on  an  automated  pilot 
agent A_o and its flexibility and reactivity. Section
3.2 then illustrates how A_o may exploit this for
tracking other agents’ behaviors. Section 3.3
outlines the issues that arise in such an endeavor.
Finally, Section 3.4 presents a simple
re-implementation of an existing pilot agent based on
the  ideas  presented  in  this  section. 

Note  that  while  the  solution  presented  here 
originated  with  the  observation  of  similarity 
among  agents,  it  is  not  necessarily  limited  to  only 
those  situations.  For  instance,  it  is  possible  that 
even  though  the  other  pilot  agents  may  possess 
similar  levels  of  flexibility  and  reactivity,  they  may 
be  constrained  in  their  behavior  by  their  doctrine. 
To track these types of constrained behaviors, A_o
would  need  to  use  similar  types  of  doctrine-based 
constraints  in  tracking  behaviors  of  other  agents. 


3.1.  An  Agent’s  Own  Behavior 
This  section  illustrates  how  an  automated  pilot 
agent A_o generates flexible and reactive behavior.
This illustration is provided using a concrete
implementation of A_o in Soar [11, 17]. Soar is an
integrated problem-solving and learning
architecture that is already well-reported in the
literature [14, 15]. The description below abstracts
away from many of the details of this
implementation, and mainly focuses on Soar’s
problem space model of problem-solving. Very
briefly, a problem space consists of states and
operators.  An  agent  solves  problems  in  a  problem 
space  by  taking  steps  through  the  problem  space  to 
reach  a  goal.  A  step  in  a  problem  space  usually 
involves  applying  an  operator  in  the  problem  space 
to  a  state.  This  operator  application  changes  the 
state.  If  the  changes  are  what  are  expected  from  the 
operator  application,  then  that  operator  application 
is  terminated,  and  a  new  operator  is  applied.  If  the 
operator  does  not  change  the  state,  or  if  the 
changes  it  causes  do  not  meet  the  expectations, 
then  a  subgoal  is  created.  A  new  problem  space  is 
installed  in  the  subgoal  to  attempt  to  achieve  the 
expected  effects  of  the  operator.  (Note  that  the 
system  uses  a  procedural  representation  for  these 
operator  expectations  —  a  declarative 
representation  is  not  necessary.  In  particular,  a 
procedural  representation  is  sufficient  to  determine 
if  the  expectations  are  achieved.) 
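
The problem space cycle described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Soar implementation: the `Operator` class, the `run` function, and the state keys (`heading`, `collision_course`, `turning`) are hypothetical names introduced here. Each operator carries a procedural test of its expected effects, and a subgoal is entered only when that test fails after the operator is applied.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class Operator:
    name: str
    apply: Callable[[State], None]     # direct effect on the state (may be nothing)
    achieved: Callable[[State], bool]  # procedural test of the expected effects
    # Operators available in the problem space installed in the subgoal.
    subspace: List["Operator"] = field(default_factory=list)


def run(op: Operator, state: State, trace: List[str], depth: int = 0) -> bool:
    """Apply op; if its expected effects are not met, subgoal into its problem space."""
    trace.append("  " * depth + op.name)
    op.apply(state)
    if op.achieved(state):
        return True  # expectations met: the operator application terminates
    for sub in op.subspace:
        run(sub, state, trace, depth + 1)
        if op.achieved(state):
            return True  # the subgoal achieved the parent's expected effects
    return False


def no_direct_effect(state: State) -> None:
    pass


def begin_turn(state: State) -> None:
    state["turning"] = True


def end_turn(state: State) -> None:
    state["turning"] = False
    state["heading"] = state["collision_course"]


start_turn = Operator("start-turn", begin_turn, lambda s: s["turning"] is True)
stop_turn = Operator("stop-turn", end_turn, lambda s: s["turning"] is False)
achieve_proximity = Operator(
    "achieve-proximity",
    no_direct_effect,
    lambda s: s["heading"] == s["collision_course"],
    [start_turn, stop_turn],
)

state = {"heading": 0, "collision_course": 45, "turning": False}
trace: List[str] = []
done = run(achieve_proximity, state, trace)
print("\n".join(trace))
```

The sketch applies the operators of a subgoal in a fixed order, whereas the architecture described here selects among competing operators; that simplification is an assumption of the example.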

Figure 3 illustrates the problem spaces and
operators A_o employs while it is trying to get into
position to fire a missile. In the figure, problem
spaces are indicated with bold letters, and
operators being applied in italics. In some problem
spaces, alternative operators are also shown (these
are not italicized). In the top-most problem space,
named TOP-PS, A_o is attempting to execute its
mission by applying the execute-mission operator.
This is the only operator it has in this problem
space. The expected effect of this operator is the
completion of A_o’s mission, which may be, for
example, to protect its aircraft carrier. Since this
expected effect is not yet achieved, a subgoal is
generated to complete the application of
execute-mission. This subgoal involves the
EXECUTE-MISSION problem space. There are
various operators available in this problem space
to execute A_o’s mission, including intercept (to
intercept an attacking opponent), fly-racetrack (to
fly in a racetrack pattern searching for opponents
when none is present), etc. In fact, in most of A_o’s
problem spaces there are always several such
options available, and A_o has to select a particular
operator that would allow it to make the most
progress. In this case, A_o selects the intercept
operator so as to intercept the opponent’s aircraft.
Given the presence of the opponent, this is the best
option available.

A_o attempts to apply the intercept operator.
However, the expected effect of this operator (the
opponent is either destroyed or chased away)
is not directly achieved. This leads to a subgoal
into the intercept problem space, where A_o
attempts to apply the employ-missile operator.
However, the missile firing range and position is
not yet reached. Therefore, A_o subgoals into the
EMPLOY-MISSILE problem space, and applies
the get-missile-lar operator. (LAR stands for
launch-acceptability-region, the position for A_o to
fire a missile at its opponent). The get-missile-lar
operator results in the application of the
achieve-proximity operator in a subgoal. Finally,
this leads to a subgoal into the start-turn operator
in the DESIRED-MANEUVER problem space.
The application of this start-turn operator causes
A_o to turn. Another operator, stop-turn, will
be applied to stop the aircraft’s turn when it
reaches a particular heading (called
collision-course). This heading will be maintained until
missile firing position is reached. At that time, the
expected effect of A_o’s get-missile-lar operator
will be achieved, and hence it will be terminated.
A_o can then apply the final-missile-maneuver
operator from the EMPLOY-WEAPONS problem
space. The final-missile-maneuver operator may
lead to subgoals in other problem spaces, not
shown in the figure.

Figure 3: A_o’s problem space/operator hierarchy. Boxes
indicate problem spaces. Text in italics indicates
currently active operator within a problem space.

Thus, by subgoaling from one operator into
another a whole operator/problem-space hierarchy
is generated. The state in each of these problem
spaces consists of a global portion shared by all of
the problem spaces and a local portion that is local
to that particular problem space. This organization
supports reactive and flexible behaviors given
appropriate pre-conditions (or conditions) for the
operators, and the appropriate operator selection
and termination mechanisms, as outlined in [13].
In particular, if the global state changes so that the
expected effects of any of the operators in the
operator hierarchy are achieved, then that operator
can be terminated. All of the subgoals generated
due to that operator are automatically deleted. Note
that A_o may also terminate an operator even if its
expected effects are not achieved. This may
happen if another operator is found to be more
appropriate for the changed situation. For instance,
suppose the opponent suddenly abandons the
combat and turns to return to its base while A_o is
attempting to fire a missile at the opponent as
shown above. In this case, the chase-opponent
operator may be more appropriate than the
employ-missile operator in the intercept problem
space. Hence, A_o terminates the employ-missile
operator (all its subgoals get eliminated as well),
and instead, A_o applies the chase-opponent
operator.
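
The effect of terminating an operator in the middle of the hierarchy can be illustrated with a short Python sketch. It assumes, purely for illustration, that the active hierarchy is held as a list ordered from the top-most operator down to the deepest subgoal; the `terminate` helper is a hypothetical name, not part of the implementation described in this paper.

```python
from typing import List


def terminate(active: List[str], name: str) -> List[str]:
    """Remove `name` and every operator subgoaled beneath it; return what was removed."""
    idx = active.index(name)
    removed = active[idx:]
    del active[idx:]
    return removed


# Active hierarchy while the agent is turning to reach missile firing position.
active = [
    "execute-mission",
    "intercept",
    "employ-missile",
    "get-missile-lar",
    "achieve-proximity",
    "start-turn",
]

# The opponent abandons the combat: employ-missile is replaced by chase-opponent.
removed = terminate(active, "employ-missile")
active.append("chase-opponent")
print(active)
```

Operators above the terminated one (execute-mission and intercept) are untouched, which is what lets the agent change tactics without restarting its mission.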

Since all of the above operators are used in
generation of A_o’s own actions, they will be
henceforth denoted using the subscript own. For
instance, employ-missile_own will denote the
operator A_o uses in employing a missile.
Operator_own will be used to denote a generic
operator that A_o uses to generate its own actions.
The global state in these problem spaces will be
denoted by state_own. Problem spaces that consist
of state_own and operator_own will be referred to as
self-centered problem spaces. The motivation for
using this method for denoting states, operators, and
problem spaces will become clearer below.

3.2.  Tracking  Other  Agent’s  Behaviors 

Given the similarities between A_o and its
opponent, the key idea in our approach to event
tracking is to use A_o’s problem space and operator
hierarchy to track opponent’s behaviors. We will
first  illustrate  this  idea  in  some  detail  using  some 
simplifying  assumptions.  The  detailed  issues 
involved  in  operationalizing  this  idea  will  be 
discussed  in  Section  3.3. 

To begin with, let us assume that A_o and its
opponent are exactly identical in terms of the
knowledge they have of this domain, and all their
other characteristics related to this domain. That is,
A_o and its opponent have identical problem spaces
and operators at their disposal to engage in the
air-combat simulation task. This simplifies A_o’s event
tracking task, since it can essentially use a copy of
its own problem spaces and operators to track the
opponent’s actions and behaviors. Operators in
these problem spaces represent A_o’s model of its
opponent’s operators. These operators are denoted
using the subscript opponent. Thus, the
execute-mission operator used in modeling an
opponent’s execution of her mission is denoted by
execute-mission_opponent. Similarly, operator_opponent
will be used to denote a generic operator used by
the opponent.

The global state in these problem spaces
represents A_o’s model of the state of its opponent,
and is denoted by state_opponent. Generating
state_opponent requires A_o to model features such as
the opponent’s sensor input. Based on information
such as the range of opponent’s sensors, at least a
portion of this state can be generated. However,
other portions of state_opponent may require fairly
complex computation, essentially mirroring the
computation that A_o requires to generate all of the
information in state_own. For instance, one
important piece of information that is computed in
state_own is the "angle off" (the angle between
A_o’s flight path and opponent’s position).
Mirroring this computation in state_opponent will
mean the computation of this "angle off" from the
opponent’s perspective (the angle between the
opponent’s flight path and A_o’s position). For now,
we make another simplifying assumption, that
A_o generates a detailed and accurate state_opponent,
and revisit this issue in Section 3.3.
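
The mirrored "angle off" computation can be illustrated with a small Python sketch. It rests on assumptions that are not in the paper: a flat two-dimensional world, positions given as (x, y) pairs, and headings in degrees measured clockwise from the +y axis. The function name `angle_off` is likewise hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def angle_off(own_pos: Point, own_heading_deg: float, other_pos: Point) -> float:
    """Angle in degrees (0 to 180) between own flight path and the other aircraft's position."""
    dx = other_pos[0] - own_pos[0]
    dy = other_pos[1] - own_pos[1]
    # Bearing to the other aircraft, clockwise from +y, in [0, 360).
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    diff = abs(bearing - own_heading_deg) % 360.0
    return min(diff, 360.0 - diff)


# The agent at the origin flying "north"; the opponent ahead of it flying "south".
agent_pos, agent_heading = (0.0, 0.0), 0.0
opponent_pos, opponent_heading = (0.0, 10.0), 180.0

# Value held in state_own: the agent's own angle off.
own_view = angle_off(agent_pos, agent_heading, opponent_pos)
# Mirrored value for state_opponent: the same computation with the roles swapped.
opponent_view = angle_off(opponent_pos, opponent_heading, agent_pos)
print(own_view, opponent_view)
```

The point of the example is that the same routine serves both states; only the arguments are exchanged.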

The problem spaces consisting of state_opponent
and operator_opponent discussed above are referred
to as opponent-centered problem spaces. With the
opponent-centered problem spaces, A_o can
essentially pretend to be the opponent. A_o then
tracks opponent’s behaviors and actions by
pretending to engage in the same behaviors and
actions as the opponent. In particular, A_o applies
operator_opponent to state_opponent, thus modeling the
opponent’s actual application of her operator to her
actual state. Since A_o is modeling the opponent’s
action, operator_opponent does not change
state_opponent. Instead, if the opponent takes some
action in the real-world, then that change is
modeled as a change in state_opponent. If this change
matches the expected effects of operator_opponent,
then that effectively corroborates A_o’s modeling of
operator_opponent. (Note that as with A_o’s
operator_own, these expectations of operator_opponent
may also only be represented procedurally. This
procedural representation is sufficient to match the
expectations.) If these expectations are
successfully matched, operator_opponent is then
terminated. As an example, consider
start-turn_opponent being applied to state_opponent. If
the opponent actually starts turning, then the
start-turn_opponent operator is corroborated and
terminated. Of course, low-level operators such as
start-turn_opponent are easy to corroborate in this
manner, since the actions they model are directly
observable. Others, however, may not generate
low-level actions that are directly observable. One
category of such operators are the higher level
operators like employ-missile_opponent, which
consists of a number of low-level actions. This
issue will be discussed below.
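
A minimal Python sketch of this corroboration step follows. It is an illustration under stated assumptions rather than the implementation described in Section 3.4: the `ModeledOperator` class, the `track` function, and the observation format are hypothetical, and the modeled operators are assumed to be applied in a fixed sequence. The essential property it shows is that modeled operators never write to the opponent's state; only observations do, and an operator is terminated once the observed state matches its expected effects.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class ModeledOperator:
    name: str
    expected: Callable[[State], bool]  # procedural test of the expected effects


def track(plan: List[ModeledOperator],
          observations: List[State],
          state_opponent: State) -> List[str]:
    """Return the names of modeled operators corroborated by the observations, in order."""
    pending = list(plan)
    corroborated: List[str] = []
    for obs in observations:
        state_opponent.update(obs)  # only real-world observations change the model state
        while pending and pending[0].expected(state_opponent):
            corroborated.append(pending.pop(0).name)  # corroborated, hence terminated
    return corroborated


plan = [
    ModeledOperator("start-turn_opponent", lambda s: s["turning"] is True),
    ModeledOperator(
        "stop-turn_opponent",
        lambda s: s["turning"] is False and s["heading"] == s["attack_heading"],
    ),
]
state_opponent = {"turning": False, "heading": 0, "attack_heading": 45}
observations = [
    {"turning": True, "heading": 10},
    {"turning": True, "heading": 30},
    {"turning": False, "heading": 45},
]
result = track(plan, observations, state_opponent)
print(result)
```

Higher level operators such as employ-missile_opponent do not fit this simple scheme, since no single observation matches their expected effects.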

This  technique  of  event  tracking,  where  an  agent 
models  another  by  pretending  to  be  in  that  agent’s 
position,  has  been  previously  used  in  automated 
tutoring  systems  [1,  19].  These  tutoring  systems 
need  the  ability  to  model  the  actions  of  the 
students  being  tutored.  For  this,  these  systems  use 
student-centered  problem  spaces  where  states  and 
operators  model  the  students  under  scrutiny.  This 
technique of modeling the student is referred to as
model tracing. The approach proposed here for
event tracking is thus based on this model tracing
work. However, there are some significant
differences. For instance, previous work has
primarily focused on static, single-agent
environments, where the agent being modeled is
the only one causing changes in the
environment [10]. There are some other
differences as well. However, before exploring the
impact of these differences, it is useful to first
understand in detail how A_o can perform event
tracking using its opponent-centered problem
spaces.  This  is  explained  below  using  the  example 
from  Figure  1.  While  this  explanation  does  not 
directly  describe  the  operation  of  an  actual 
implementation,  it  is  based  on  an  actual 
implementation  that  will  be  described  in  Section 
3.4.  Basically,  the  description  presented  here  will 
be  used  to  motivate  some  representational 
modification  leading  up  to  the  implementation 
described  in  Section  3.4. 

Consider the situation in Figure 1-a. In this case,
A_o models the opponent’s operator hierarchy as
shown in Figure 4-a. A_o is seen to accurately
model this goal hierarchy, and in particular without
any ambiguity about what actions the opponent is
exactly engaged in. This is again a simplifying
assumption, and we will return to it in Section 3.3.
Figure 4-b shows A_o’s own operator hierarchy
corresponding to the situation in Figure 1-a. We
assume that A_o dovetails the execution of these
operator hierarchies, communicating important
relevant information from one to the other.

Figure 4: (a) A model of opponent’s operator hierarchy,
and (b) A_o’s own operator hierarchy.

Consider the model of the opponent’s operator
hierarchy from Figure 4-a. One of the operators in
this hierarchy is final-missile-maneuvers_opponent,
which models the opponent’s final
missile-launching behavior. This is a high-level operator,
and its expectations cannot be directly
corroborated by observation. This operator is seen
to generate a subgoal, where the first operator is
achieve-attack-heading_opponent. This would require
a start-turn_opponent operator to turn to
attack-heading. In Figure 1-a, attack heading is achieved,
and state_opponent reflects that fact. Hence,
stop-turn_opponent is being modeled as the current
operator, to model the opponent’s stopping her
turn at attack-heading.

Now consider A_o’s own operator hierarchy in
Figure 4-b. A_o is attempting to get into position to
fire its own missile using the achieve-proximity_own
operator in the GET-MISSILE-LAR problem
space. When the situation changes from Figure 1-a
to Figure 1-b, A_o selects the cut-to-ls_own operator
in place of the achieve-proximity_own operator in
the GET-MISSILE-LAR problem space. This
operator is intended to increase the lateral
separation between the two aircraft.^ The
cut-to-ls_own operator causes A_o to turn its own
aircraft as shown in Figure 1-b. As the aircraft
turns to a particular heading, this new heading is
modeled in state_own. Thus the cut-to-ls_own
operator leads to indirect modification of state_own.

This change in state_own has to be communicated
to state_opponent, to update A_o’s heading in
state_opponent. This results in further modification in
state_opponent, indicating that the opponent’s attack
heading is no longer achieved. Based on this
modification, achieve-attack-heading_opponent is
re-activated (or re-applied). This operator again
subgoals into the DESIRED-MANEUVER
problem space where the start-turn_opponent
operator is reapplied. When the opponent starts
turning, this operator is corroborated and
terminated. The next operator in this problem
space is stop-turn_opponent. When the opponent
actually stops turning after reaching attack
heading, as shown in Figure 1-c, state_opponent is
modified to indicate that opponent’s
attack-heading is achieved, and hence the
stop-turn_opponent operator is corroborated. The
change in heading in state_opponent needs to be
communicated back to state_own, so that A_o may
readjust its heading in cut-to-ls_own if required.

Continuing with Figure 1-c, the opponent’s
achievement of attack-heading also corroborates
the achieve-attack-heading_opponent operator, which
is now terminated. A new operator from the
FINAL-MISSILE-MANEUVERS problem space,
push-fire-button_opponent, is now applied.
This operator predicts a missile firing, but it is
known that that cannot be observed. Hence,
push-fire-button_opponent is terminated even though
there is no direct observation to support that
termination. However, the resulting missile firing
is marked as not being highly likely. Nonetheless,
this missile launch, even with its low likelihood, is
communicated to state_own, so that A_o may react to
it (for instance if A_o’s mission forbids it from
taking any risks at all). At this point, given the
termination of the push-fire-button_opponent
operator, opponent’s
final-missile-maneuvers_opponent operator is
corroborated and terminated. Following that, an
Fpole_opponent operator in the EMPLOY-MISSILE
problem space predicts an Fpole turn. This again
generates a subgoal, back into the
DESIRED-MANEUVER problem space and the
start-turn_opponent operator is reapplied. When the
opponent executes her Fpole turn in Figure 1-d, the
Fpole_opponent operator is corroborated and
terminated. At this point, all of the expectations
for the high-level employ-missile_opponent operator
are corroborated; and hence state_opponent is
modified to indicate that a missile launch is highly
likely. These changes in state_opponent (the change
in the opponent’s heading and the highly likely
status of the missile launch) are once again
communicated to state_own. Based on the high
likelihood of the missile launch, A_o activates the
operator missile-evasion_own to evade the incoming
missile (Figure 1-e). This change in A_o’s heading
is once again communicated back to state_opponent.

^Lateral separation is defined as the perpendicular distance
between the line of flight of A_o’s aircraft and the position of its
opponent. When the two aircraft are pointing right at each other
as in Figure 1-a, there is no lateral separation between the two
aircraft. Increasing lateral separation provides a positional
advantage.

Thus, A_o executes its own operators, and tracks
opponent’s actions and behaviors using
operator_opponent and state_opponent. This can help
A_o to track its opponent’s behaviors, and address
all of the constraints on event tracking outlined in
Section 2. However, there are some important
issues involved in addressing our earlier
constraints with this approach. There are also
some simplifying assumptions that we made in
illustrating event tracking: (i) A_o and its opponent
are identical; (ii) A_o performs all of the complex
computation that is necessary to accurately model
opponent’s state; and (iii) A_o can accurately model
opponent’s operator hierarchy without any
ambiguity. Relaxing these assumptions leads to
some additional issues, which also relate to the
constraints on event tracking. These issues are all
discussed in the next Section.


3.3. Addressing Constraints on Event
Tracking 

The first constraint on event tracking was for an
agent to track highly flexible and reactive
behaviors of its opponent, while taking appropriate
agent interactions into account. The use of
opponent-centered problem spaces with
operator_opponent and state_opponent helps in partly
addressing this constraint (this was the motivation
behind this approach to begin with). In particular,
operator_opponent can be activated and terminated in
the same flexible manner as operator_own. There is
complete uniformity in the treatment of the two
types of operators.

However, these opponent-centered problem
spaces by themselves do not address the issue of
modeling the interactions among the different
agents. In particular, the method outlined in
Section 3.2 requires building one operator
hierarchy for A_o, and one for each opponent, with
their own global states. This leads to a situation
where multiple compartmentalized operator
hierarchies with their own global states are
generated. Modeling the strong agent interactions
present in this domain requires passing messages
from one compartment to another. For instance, as
described above, when A_o changes heading, that
information needs to be propagated from state_own
to state_opponent. Similarly, when the opponent fires
a missile that information has to be communicated
to state_own from state_opponent. Similarly, if A_o is to
take some action depending on whether the
intercept_opponent operator is being executed, then
that information would need to be propagated to
A_o’s compartment.

Given the level of interactions among A_o and its
opponents,  this  message  passing  can  be  a 
substantial  overhead.  Furthermore,  there  can  be 
many  aircraft  involved  in  the  combat,  leading  to  an 
increase  in  the  message  passing  overheads.  This  is 
particularly  problematical  given  the  second 
constraint  on  event  tracking  (of  real-time 
performance)  and  the  fourth  constraint  (which 
implies  continuous  agent  interactions). 
Additionally,  the  communication  among  the 
different  compartments  essentially  duplicates  the 
information  of  one  compartment  in  another.  For 
instance,  when  a  missile  is  fired,  this  information 
is  duplicated  in  different  compartments.  Such 
duplication  is  problematical  in  terms  of 
maintaining  its  consistency.  If  a  missile  is  removed 
from  one  compartment,  it  must  be  removed  from 
all  of  the  others. 

The solution we are investigating to alleviate the
problem with this compartmentalization is to
merge the different operator hierarchies for the
different agents into a single compartment, which
we will refer to as a world-centered problem space
(WCPS for short). WCPS eliminates the
boundaries between different self-centered and
opponent-centered problem spaces. Instead, the
different operator hierarchies are maintained
within the context of a single WCPS. There is also
a single world state. This state includes A_o's own
problem-solving state (state_self), A_o's model of the
state of its opponent (state_opponent), as well as A_o's
model of the states of other entities, including
other opponents or friendlies in the world.


WCPS eliminates the need for passing messages
to model interactions; instead, interactions are
modeled in terms of changes to the single global
state. Operator_self and operator_opponent are
directly able to reference this global state as well
as other operators. Furthermore, the problem of
duplication  of  information  is  avoided.  For  instance, 
a  missile  fired  by  the  opponent  gets  modeled 
within  this  single  global  state  as  a  single  missile. 
Operator  hierarchies  modeling  all  of  the  different 
agents  can  directly  react  to  this  missile. 
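
The difference between duplicated compartment states and a single world state can be illustrated with a minimal Python sketch. It is not the Soar implementation: the names `WorldState`, `Missile`, `react_self`, and `react_opponent`, and the action labels they return, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Missile:
    """A missile in flight (illustrative record, not a simulator type)."""
    fired_by: str
    target: str


@dataclass
class WorldState:
    """Single global state referenced by every operator hierarchy."""
    headings: dict = field(default_factory=dict)   # entity name -> heading in degrees
    missiles: list = field(default_factory=list)   # one entry per missile, never copied


def react_self(world: WorldState) -> str:
    """Operator for the agent itself: evade if any missile targets it."""
    if any(m.target == "self" for m in world.missiles):
        return "evade"
    return "intercept"


def react_opponent(world: WorldState) -> str:
    """Operator modeling the opponent: after firing, it is expected to support its missile."""
    if any(m.fired_by == "opponent" for m in world.missiles):
        return "support-missile"
    return "intercept"


world = WorldState(headings={"self": 90.0, "opponent": 270.0})
missile = Missile(fired_by="opponent", target="self")

world.missiles.append(missile)      # recorded exactly once, no message passing
print(react_self(world), react_opponent(world))

world.missiles.remove(missile)      # one removal keeps both hierarchies consistent
print(react_self(world), react_opponent(world))
```

Because both operator functions read the same `WorldState`, there is no second copy of the missile that could fall out of date.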

An additional benefit of the single global state in
WCPS relates to one of the assumptions
mentioned in Section 3.2. In particular, A_o need
not perform all of the complex computation
required in modeling the opponent's state, but instead
may "re-use" some of its own computation. Consider
the example of the computation of "angle off"
from the opponent's perspective, as mentioned in
Section 3.2. With the global state in WCPS, A_o
does not need to recompute this "angle off".
Instead, it is automatically computed in A_o's
state_self and can simply be reused. In
particular, A_o's state_self already maintains the
computation of "target aspect" from its own
perspective (the angle between the opponent's
flight path and A_o's position). This is precisely the
definition of "angle off" from the opponent's
perspective. Thus, instead of computing the "angle
off" from the opponent's perspective and "target
aspect" from A_o's perspective separately, a single
computation can be performed and used for both
purposes. Of course, not all of the complex
computation involved in generating the opponent's
state can be avoided in this manner. The
interesting research question then is determining
what portion can be re-used in this manner, and
how much extra computation is really necessary.
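
As a worked illustration of this reuse, the sketch below computes the angle between the opponent's flight path and the agent's position once, and stores it under both names. The two-dimensional geometry, the heading convention (degrees clockwise from north), and the dictionary-based states are assumptions of this sketch rather than details taken from the system described here.

```python
import math


def angle_between_path_and_position(path_heading_deg, from_xy, to_xy):
    """Angle in degrees (0..180) between a flight path and the bearing to a position.

    Headings are measured clockwise from north (the +y axis); this convention
    is an assumption of the sketch.
    """
    dx = to_xy[0] - from_xy[0]
    dy = to_xy[1] - from_xy[1]
    bearing_deg = math.degrees(math.atan2(dx, dy)) % 360.0
    diff = abs(path_heading_deg - bearing_deg) % 360.0
    return min(diff, 360.0 - diff)


own_position = (0.0, 0.0)
opponent_position = (0.0, 10.0)   # due north of the agent
opponent_heading = 180.0          # flying due south, straight at the agent

# Computed once, as "target aspect" in the agent's own state ...
state_self = {
    "target_aspect": angle_between_path_and_position(
        opponent_heading, opponent_position, own_position)
}

# ... and reused, not recomputed, as "angle off" in the model of the opponent.
state_opponent = {"angle_off": state_self["target_aspect"]}

print(state_self, state_opponent)
```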

This shift from small self-centered and
opponent-centered problem spaces to WCPS is
related to the objective framework used in
simulation and analysis of DAI systems [5], which
describes the essential, "real" situation in the
world. However, the focus of our work is on an
individual agent using its world-centered model for
event tracking. While this model introduces a shift
towards an objective point of view, by definition it
is an agent's subjective view of its environment,
and may contain approximations in
operator_opponent and state_opponent.³


³Note that if the agents do not interact, then a single WCPS
may  not  be  appropriate,  and  separate  problem  spaces  may  be  the 
right  choice  for  modeling  them. 
The second constraint on event tracking relates
to A_o's ability to track events in real time. The
key impact of this constraint is on generating an
accurate and unambiguous operator_opponent
hierarchy, one of the assumptions made in the
previous section. In particular, it constrains the
methods A_o can employ in attempting to generate
an accurate and unambiguous operator hierarchy.
For instance, Ward [19] presents one general
method for generating an unambiguous operator
hierarchy. This method involves an exhaustive
search over all possible operator applications until
the one that creates the right expectations, i.e., one
that matches the opponent's current actions, is
created. If there is more than one such operator
application, then one is chosen randomly. A
wrong choice can be made in such situations.
However, as soon as that is discovered, another
exhaustive search can be performed. Given the
real-time constraint on event tracking, this type of
exhaustive search strategy cannot be applied.
While Ward suggests some heuristics to constrain
the search, this remains a difficult problem. The
WCPS approach at least provides a partial answer
here. In particular, given the uniformity among
operator_self and operator_opponent in WCPS, the
mechanism employed in resolving ambiguity in
operator_self (search control rules)
can also be used in resolving ambiguity in
operator_opponent. Besides search control rules,
another possibility for resolving ambiguity in
WCPS is to generate the goal hierarchy bottom-up
rather than top-down. While both of these are
powerful tools in WCPS, their advantages and
disadvantages in this context are not yet well
understood.
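
The two ways of breaking ties among candidate opponent operators can be contrasted in a small Python sketch. The candidate table, the single control rule, and all names below are hypothetical. The sketch shows only how a choice is made among operators that match an observation, and does not model the cost of the search itself.

```python
import random

# Hypothetical opponent operators and the observable action each one predicts.
CANDIDATES = {
    "intercept": "turn-toward",
    "employ-missile": "turn-toward",
    "evade": "turn-away",
}


def matching_operators(observed_action):
    """Test every candidate operator against the observed opponent action."""
    return sorted(op for op, action in CANDIDATES.items() if action == observed_action)


def choose_randomly(matches, rng):
    """Tie-breaking as described for the exhaustive method: pick any match."""
    return rng.choice(matches) if matches else None


def choose_with_control_rules(matches, state):
    """Tie-breaking with a control rule of the kind used for the agent's own operators."""
    if not matches:
        return None
    # Illustrative rule: inside missile range prefer employ-missile, else intercept.
    in_range = state["range_nm"] <= state["missile_range_nm"]
    preferred = "employ-missile" if in_range else "intercept"
    return preferred if preferred in matches else matches[0]


matches = matching_operators("turn-toward")
print(matches)
print(choose_randomly(matches, random.Random(0)))
print(choose_with_control_rules(matches, {"range_nm": 8, "missile_range_nm": 10}))
```

The random choice may have to be revised once it is contradicted by later observations, whereas the rule-based choice is deterministic given the current state.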

The real-time constraint also raises the issue of
abstractions in event tracking. In particular, Hill
and Johnson [10] have recently argued that
tracking an individual agent's actions in detail in a
dynamic environment may prove computationally
intractable. They advocate detailed tracking only
where necessary, and reliance on abstractions
elsewhere. In WCPS, abstractions in modeling an
operator would imply that detailed subgoals for
modeling that operator need not be generated. For
instance, A_o may not model the detailed operators
used in accomplishing get-missile-lar_opponent.
Thus, when get-missile-lar_opponent is activated, it
may not lead to any subgoal. However, when the
opponent actually reaches the LAR (missile firing
position), get-missile-lar_opponent can be considered
as corroborated and terminated. Unfortunately, this
method of abstract modeling may not be
appropriate for corroborating an operator such as
employ-missile, which involves multiple
maneuvers. In this case, the intermediate headings
of the opponent's aircraft may be important, and just
testing the terminating position may be an
inappropriate test for corroboration. Automatic
generation of the right levels of abstraction is an
interesting issue for future work.
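
The distinction between corroborating an abstract operator by its terminating condition and corroborating a multi-maneuver operator by its intermediate headings can be sketched as follows. The class names, the 10 nm firing range, the heading sequence, and the tolerance are all arbitrary assumptions made for illustration. Heading wrap-around at 360 degrees is ignored for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AbstractOperator:
    """Opponent operator tracked without subgoals (hypothetical structure)."""
    name: str
    terminated: Callable[[dict], bool]   # test on the observed opponent state

    def corroborated(self, observations: List[dict]) -> bool:
        # Only the latest observation is tested; intermediate ones are ignored.
        return bool(observations) and self.terminated(observations[-1])


@dataclass
class DetailedOperator:
    """Opponent operator whose intermediate maneuvers must also be observed."""
    name: str
    expected_headings: List[float]       # headings expected in order, in degrees
    tolerance_deg: float = 10.0

    def corroborated(self, observations: List[dict]) -> bool:
        remaining = list(self.expected_headings)
        for obs in observations:
            if remaining and abs(obs["heading"] - remaining[0]) <= self.tolerance_deg:
                remaining.pop(0)
        return not remaining


get_missile_lar = AbstractOperator("get-missile-lar", lambda s: s["range_nm"] <= 10.0)
employ_missile = DetailedOperator("employ-missile", [270.0, 300.0, 270.0])

track = [
    {"heading": 270.0, "range_nm": 14.0},
    {"heading": 270.0, "range_nm": 9.5},
]
print(get_missile_lar.corroborated(track))   # the final position alone is tested
print(employ_missile.corroborated(track))    # intermediate headings are required
```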

The third constraint on event tracking was the
generation of expectations for an unseen, but
ongoing event. In WCPS, the application of an
operator_opponent in essence is the expectation for
the opponent to execute a certain plan or action.
Thus, this constraint can be addressed in a
straightforward manner. However, since the event
is unseen, there can be no corroboration of it. One
possibility to deal with this situation is to terminate
operator_opponent if the relevant action is known to
be unobservable (for instance, since the opponent's
aircraft is not observable on radar).

The  fourth  constraint  is  related  to  the  continuous 
nature  of  event  tracking.  The  main  implication  of 
this constraint is the continuous interaction among
agents, which, as discussed above, leads to the
move towards WCPS.

There were also three assumptions made in the
previous section to simplify event tracking. The
second and the third assumptions, related to
modeling of the opponent's state and operator
hierarchy, have been discussed above. However,
the first of the assumptions has not been
discussed. This assumption is that the automated
pilot agent A_o and its opponent are identical. The
key implication of this assumption is that A_o can
create a copy of its own operator and problem
space hierarchy to model the opponent. (This
creation of a copy by itself may not be
straightforward if all of A_o's knowledge is
essentially procedural.) This assumption
essentially substitutes for another assumption in
the plan recognition literature: the agent that is
recognizing a plan is assumed to have full
knowledge of all of the plans that the planning
agent can execute [12]. If A_o has additional
knowledge about its opponent's plans or
operators, and how those differ from its own,
then A_o needs the ability to interleave those with its
own copy of operators while tracking the opponent's
behaviors. If A_o does not have this additional
knowledge, then A_o will need to model its
opponent with incomplete information, or to learn
that information from observation of the
opponent's actions or by some other means.
3.4. A Prototype WCPS-based Agent

An  important  test  of  the  WCPS  model  is  its 
actual  application  in  a  dynamic,  multi-agent 
environment.  The  task  of  developing  an  automated 
pilot  for  the  air-combat  simulation  domain  is 
tailor-made  for  this  test.  The  development  of 
automated  pilots  in  this  domain  is  currently  based 
on a system called TacAir-Soar [11,17], which, as
mentioned earlier, is developed using the Soar
integrated problem-solving and learning
architecture. TacAir-Soar is a "non-trivial" system
that includes about 800 rules.⁴ Its original
self-centered problem space design worked against an
initial  inactive  opponent.  However,  it  very  quickly 
failed  against  an  active  opponent  —  there  was  a 
need  for  tracking  events  related  to  actions  of  the 
other  agents. 

To  survive  in  this  real-time  environment,  the 
system  was  forced  to  employ  world-centered 
problem  spaces.  However,  these  world-centered 
problem spaces are created by an incomplete
and ad hoc mechanism that suffers from three
problems. First, event tracking is not robust,
meaning the automated pilot agent can and does
generate unhelpful or misleading interpretations for
key opponent actions, such as the opponent's turn
in Figure 1-c. This lack of robustness also implies
that  the  automated  pilot  is  unable  to  deal  with 
sensor  limitations  effectively.  Thus,  sometimes  if 
radar  contact  is  momentarily  lost,  the  agent  may 
not  track  the  opponent’s  actions.  A  second 
problem  with  the  existing  world-centered  problem 
spaces  is  that  event  tracking  does  not  generate 
expectations.  A  third  problem  is  that  the  agent's 
real-time  response  can  suffer  due  to  sequential 
operator  execution. 

We  have  implemented  a  variant  of  TacAir-Soar 
that  is  fully  based  on  WCPS.  To  create  this 
variant,  we  started  with  the  operators  and  problem 
spaces  that  are  used  by  a  TacAir-Soar-based 
automated  pilot  in  generating  its  flexible  actions 
and  behaviors.  We  then  generated  a  copy  of  these 
operators  and  problem  spaces  to  model  the 
automated  pilot’s  opponent  within  a  single  WCPS. 
This  copy  was  hand  generated  (since  most  of 
TacAir-Soar’s  knowledge  is  procedural,  automatic 
generation  of  such  a  copy  is  an  interesting  research 
question that is left for future work). In generating
this copy, some of TacAir-Soar's operators and
problem spaces were abstracted away; these
opponent actions were not modeled in detail. The
result is an implementation that is able to track
events while generating expectations. It is also
promising in terms of being more robust in
tracking events. The implementation tracks the
opponent's actions and behavior as described
in Section 3.2. Simultaneously, as
discussed in Section 3.3, it avoids the
communication overheads and duplication of
information. The implementation currently works
only in single-opponent situations. Work on
extending the implementation to multiple-opponent
situations is currently in progress.

⁴Since the completion of the experiment described in this
section, the size of the TacAir-Soar system has grown to about
1500 rules.


4.  Summary 

This  paper  makes  two  contributions.  First,  it 
presents  a  detailed  analysis  of  event  tracking  in  the 
"real-world",  dynamic,  multi-agent  environment  of 
air-combat  simulation.  This  analysis  reveals 
interesting  issues  that  represent  a  novel 
intersection  of  the  areas  of  plan  recognition  and 
DAI. Tools and techniques that have emerged from
single-agent environments are inadequate to
address these issues. The second contribution of
the paper is the idea of world-centered problem
spaces (WCPS), for use in general multi-agent
situations. WCPS is independent of problem
spaces as such; the key idea is that an agent
treats the generation of its own behavior and
tracking of others uniformly. WCPS was used in
(re)implementing  automated  pilots  for  air-combat 
simulation. 

The paper also outlined several unresolved
issues in WCPS. These include resolving
ambiguity in the opponent's actions, generating
approximations, and learning about the opponent from
observation, among others. We hope that addressing
these issues will help WCPS
perform event tracking in a more robust fashion.
Abstract

Testing and knowledge acquisition have been two
of the most tedious and time-consuming tasks in
the development of IFOR agents in the TacAir-Soar
(TAS) project. This paper presents some
suggestions for a human control tool, similar to
a simple flight simulator, that can be helpful in
these two areas. Furthermore, we discuss some
of the design considerations and implementation
issues that are faced in developing such a tool.
Such a tool, called the Human Instrument Panel
(HIP), has been developed for use in the TacAir-Soar
project. Some key features of HIP are how
cheaply it has been developed, how quickly it was
incorporated into the TacAir-Soar project, and
how easily it can be adapted to similar domains.

Introduction 

The Human Instrument Panel (HIP) is a tool
designed to make the testing of TacAir-Soar (TAS)
agents in the ModSAF simulator easier and less
tedious. Additionally, it has been useful to a lesser
extent as an aid in the lengthy process of
knowledge acquisition. This paper presents suggestions
as to how this type of tool can be used as an aid in
testing and knowledge acquisition and discusses
some of the design considerations that the creator
of such a tool might face. Since the Human
Instrument Panel is meant to be a time-saving tool,
one major design consideration is that the time
and effort saved by HIP outweigh the time and
effort spent on development and maintenance.

The TacAir-Soar project [1] at the University
of Michigan, ISI, and Carnegie Mellon University
has combined Soar, a state-of-the-art artificial
intelligence architecture, and ModSAF, a
sophisticated battlefield simulator, to create realistic,
human-like computer agents in a beyond visual
range air-to-air combat domain. The development
of these agents can be viewed as a repeated
cycle through three phases [2]:


1.  Knowledge  Acquisition 

2.  Implementation 

3.  Testing 

Usually the majority of the development time in
the TacAir-Soar project is spent in the knowledge
acquisition and testing phases. It is for this
reason that HIP has been developed specifically to
assist in these two tasks.

The  main  difficulties  in  knowledge  acquisition 
are  the  vast  amount  of  information  that  must 
be  acquired  and  the  formulation  of  questions  to 
extract the most important information. One
effective form of knowledge acquisition, which
sidesteps the question-formulation difficulty, is
to  observe  expert  pilots  as  they  fly  missions  on 
simulators.  This  allows  the  questioner  to  identify 
issues  that  might  never  come  up  in  a  question 
and  answer  session.  Unfortunately,  the  cost  of 
running  the  simulators  and  getting  the  pilots  and 
researchers  to  the  simulator  site  does  not  allow  for 
the  amount  of  free  play  that  would  be  required  to 
cover  the  breadth  of  necessary  knowledge. 

As with knowledge acquisition, one of the
difficulties of the testing phase is getting the experts,
researchers, and machines together so that the
experts can evaluate the TacAir-Soar agents.
Another major difficulty is the lack of a flexible,
realistic opponent against which to test the TAS
agents. ModSAF-controlled agents are not
sufficiently intelligent to provide realistic challenges to
the TAS agents, and there is no support in
ModSAF for direct human control of agents (i.e., no
flight simulator capability). Until recently,
testing a TAS agent in a specific scenario required
creating a separate set of TAS agents as
opponents with hard-coded missions, and then
carefully designing the initial situation so that the
desired scenario would occur. This was a
time-consuming process, and testing TAS agents only
against other TAS agents left the possibility of
undetected errors.

One obvious solution to these problems is to
create a tool which acts as a simple flight
simulator interface to ModSAF agents. This would
provide the ability to observe free play sessions
without expensive simulators (TacAir-Soar has been
tested against BATTs¹ simulators at the
WISSARD lab at the Oceana Naval base).
Additionally, the interface could also serve as a realistic
opponent for TAS agents, pointing out errors that
TAS vs. TAS testing might miss. This is exactly
the role that the Human Instrument Panel (HIP)
is designed to fill. HIP allows the user to attach a
simple instrument panel to a ModSAF agent and
issue flight commands to that agent's plane.

One goal of this paper is to describe the wide
range of possible uses for a human control tool
such as HIP in the creation of intelligent forces
(IFORs). We hope that the ways in which we have
found HIP to be useful will suggest techniques
that will make testing and knowledge acquisition
for other IFORs easier. Some of the design
considerations involved in creating human control
tools will be discussed along with the pros and
cons of the choices made while developing HIP.

The  next  section  of  this  paper  will  describe  the 
potential  uses  for  human  control  tools,  such  as 
HIP,  both  as  testing  tools  and  aids  to  knowledge 
acquisition.  Section  3  will  point  out  some  of  the 
important  design  considerations  and  discuss  the 
advantages  of  various  approaches  while  section  4 
provides  a  quick  description  of  the  various  forms 
of  HIP  that  we  have  developed  (F-14D,  MiG-29, 
E-2C), including screen snapshots of HIP in action.

Functionality of a Human Control Tool

When we set out to develop the Human
Instrument Panel, we were motivated by the need for
a flexible and realistic opponent for TAS agents.
As the project progressed, we came up with many
more potential uses which required only minor
additions to the original specifications. The
functionality of human control tools can be divided
into two categories: testing aids and knowledge
acquisition aids. Since HIP was designed mainly
as a tool for testing, we will focus primarily on
its applications to this phase. There are three
major components of IFOR development in the
Tactical Air domain that HIP facilitates: testing
in a variety of situations, monitoring an agent's
local (instrument-level) behavior during testing,
and scenario setup.

Testing Intelligent Agents. HIP enables
three different types of testing that are
beneficial to the IFOR designer. First, one would like
to test the performance of an agent against a
human pilot. Such tests may involve determining
the response of the agent to a specific tactical
situation (e.g., a bogey approaches TAS from the right
rear quarter and performs a set series of
maneuvers). With HIP, the researcher can take
control of a plane and use HIP to approach the TAS
agent from the specified quarter and perform the
required maneuvers. The TAS agent's responses
to these actions can be recorded for later
evaluation. This is an example of scripted testing. The
nature of the HIP interface also encourages
free-play testing. One can simply create a TacAir-Soar
agent, create a HIP agent, and then fly
head-to-head. While on the surface this may seem more
like play than research, this can lead to the
observation of behaviors not explicitly seen during
scripted testing.

¹The Basic Air Tactics Trainer is a
medium-fidelity aircraft simulator.

A second type of testing that can be
accomplished with HIP examines an agent's ability to
coordinate its actions with other agents. This is done by
allowing a human and a TacAir-Soar agent to fly
together. Communication from HIP to the agent
is accomplished using a series of pull-down menus
that correspond to the types and formats of
radio messages the TacAir-Soar agents can send to
one another. The advantage of testing this
coordination with a human pilot (as opposed to two
IFORs interacting) is that specific tests can be
scripted that would be difficult to carry out with
two IFORs. For example, consider the question
of testing an agent's behavior when its wingman
is lost. Using an IFOR for this test would require
implementing an agent that would purposely lose
its lead. Similar tests in the BATTs are
impossible since there is no communication interface
between that simulator and Soar. However,
using HIP in this test requires only that the pilot
acting as the wingman fly away after establishing
a communication link with the lead.

A third type of testing facilitated by HIP is the
ability to monitor the response of IFOR agents
in situations with varying world knowledge. For
example, an agent's behavior toward a single
contact should probably be modified if the agent
is informed of multiple hostile contacts beyond
radar visibility. This ability to control the agent's
world knowledge is achieved in HIP by
introducing a new plane type, the E-2C². The HIP
E-2C can direct BRASH
(Bearing-Range-Altitude-Speed-Heading) contact information to
TacAir-Soar agents as well as to other HIP agents (and,
conceivably, to BATTs pilots as well). The HIP
E-2C also complements the design of an
E-2C TacAir-Soar agent. Prior to the
implementation of the HIP E-2C, there was no way to inform
TacAir-Soar agents about contacts out of their
radar range.

²The E-2C is a prop-driven, non-combat plane
with a large AWACS-style radar.
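
A BRASH report is a five-field record, and it can be pictured with a small Python sketch. Only the field order (Bearing, Range, Altitude, Speed, Heading) comes from the text. The units, the message layout produced by `to_message`, and the helper `known_contacts` are assumptions made for illustration and are not the format used by HIP or TacAir-Soar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrashContact:
    """One contact report: Bearing-Range-Altitude-Speed-Heading.

    Units are assumptions of this sketch; the source does not specify them.
    """
    bearing_deg: float
    range_nm: float
    altitude_ft: float
    speed_kts: float
    heading_deg: float

    def to_message(self) -> str:
        """Render the report as a single text line (hypothetical layout)."""
        return "BRASH {:03.0f} {:.0f} {:.0f} {:.0f} {:03.0f}".format(
            self.bearing_deg, self.range_nm, self.altitude_ft,
            self.speed_kts, self.heading_deg)


def known_contacts(radar_contacts, reported_contacts):
    """World knowledge: own radar contacts plus any additional reported contacts."""
    merged = list(radar_contacts)
    merged.extend(c for c in reported_contacts if c not in radar_contacts)
    return merged


on_radar = [BrashContact(45.0, 30.0, 20000.0, 450.0, 225.0)]
from_e2c = [BrashContact(60.0, 90.0, 25000.0, 500.0, 240.0)]   # beyond radar range

for contact in known_contacts(on_radar, from_e2c):
    print(contact.to_message())
```

Varying the list passed as `reported_contacts` is one way to picture how the same tactical situation can be presented to an agent with different amounts of world knowledge.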

Monitoring An Intelligent Agent's Behavior.
In ModSAF each unit is displayed as an icon
with associated heading, speed, and altitude
values. This is sufficient for observing a TacAir-Soar
agent's high-level behavior, but it does not
provide much information about how realistically
the agent is flying, nor any important information
such as radar modes and weapon selection.
Another possible use of a human control tool is to
"peek over TacAir-Soar's shoulder" as the TAS
agent flies, by displaying that plane's instrument
readouts while leaving the agent in control. To
support this, an option was added to HIP to
attach the instrument panel to a plane but
suppress the transmission of any flight commands
from HIP. This has provided some useful feedback
as to how certain flight dynamics are handled in
ModSAF. Additionally, expert pilots may find it
easier to evaluate TacAir-Soar agents from the
familiar (somewhat realistic) cockpit perspective.
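
The attach-but-suppress option can be sketched as a single flag on the panel object. The class `InstrumentPanel`, its methods, and the dictionary standing in for a plane's instrument values are hypothetical. They illustrate the idea only and are not HIP's actual structure.

```python
class InstrumentPanel:
    """Panel attached to one simulated plane (hypothetical structure)."""

    def __init__(self, plane: dict, transmit: bool = True):
        self.plane = plane          # shared record of the plane's instrument values
        self.transmit = transmit    # False: observe only, the agent keeps control

    def readouts(self) -> dict:
        """Instrument values are visible in either mode."""
        return dict(self.plane)

    def set_heading(self, heading_deg: float) -> bool:
        """Send a flight command; returns False when transmission is suppressed."""
        if not self.transmit:
            return False
        self.plane["desired_heading"] = heading_deg
        return True


plane = {"heading": 90.0, "desired_heading": 90.0, "radar_mode": "search"}

monitor = InstrumentPanel(plane, transmit=False)
print(monitor.set_heading(180.0), monitor.readouts())   # command is dropped

pilot = InstrumentPanel(plane, transmit=True)
print(pilot.set_heading(180.0), pilot.readouts())       # command takes effect
```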

Building Scenarios. The ability to selectively
suppress or allow the transmission of flight
commands from HIP could also be very useful when
setting up scenarios to test an agent's response to
specific situations. It is often difficult to set the
initial position of each unit involved in the
scenario so that a desired encounter occurs. With
the human control tool it could be possible to
take control of each agent, fly it into exactly
the position required, and then return control to
TAS. However, once the user has taken control away
from TAS, HIP does not currently allow control to be
returned to TAS. This is due to the difficulty of
keeping the TAS agent's internal state consistent
with the external world. Once this problem is
overcome, it is possible that TAS could learn
behaviors by "observing" while a human flies the
agent's plane and executes the desired actions.

Design  Considerations 

There are obviously many ways that a human
control tool can be implemented, each with
various advantages and disadvantages. In this section
we will discuss a few of these implementation
decisions as well as some important high-level design
considerations.

High-Level Considerations. Although the
anticipated uses of HIP drove its design, several
other high-level factors had to be considered as
well. Four are discussed here: level of detail in the
simulation, non-invasive interface with the
simulator, shallow learning curve for new users, and a
simple implementation.

In considering the level of cockpit detail for the
HIP interface, one key decision was made which
affected the subsequent development of the tool.
The TacAir-Soar agents use a cockpit
abstraction [1] to interface with ModSAF. The agents
send commands to ModSAF indicating flight
parameters such as desired altitude and speed;
ModSAF includes functions to convert these high-level
commands into low-level flight surface, sensor, and
weapon controls. Since one of the goals of the
HIP project was to produce a realistic tool as
quickly as possible, the TAS/ModSAF interface
was adopted for HIP as well. This resulted in
a less realistic interface than many flight
simulators: there is no joystick, and commands are
entered for the desired heading, altitude, and speed
while ModSAF determines the appropriate flight
response. This decision does represent a
compromise in simulator realism. However, since the
domain of interest is tactical rather than low-level
flight, this choice allowed much faster
development while not compromising HIP's anticipated
uses.
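
The cockpit abstraction can be pictured as a command that carries only desired values, with the simulator side responsible for the flight response. The sketch below is a toy stand-in for that division of labor. `FlightCommand`, `SimulatedAircraft`, and the constant turn rate are assumptions of the sketch and are not ModSAF functions or its flight model. Only heading is simulated, to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlightCommand:
    """High-level command: desired values only, no stick or throttle inputs."""
    heading_deg: Optional[float] = None
    altitude_ft: Optional[float] = None
    speed_kts: Optional[float] = None


class SimulatedAircraft:
    """Toy simulator side that turns a desired heading into gradual motion."""

    def __init__(self, heading_deg: float, turn_rate_deg_s: float = 3.0):
        self.heading_deg = heading_deg
        self.desired_heading_deg = heading_deg
        self.turn_rate_deg_s = turn_rate_deg_s

    def command(self, cmd: FlightCommand) -> None:
        # Altitude and speed fields are accepted but ignored in this sketch.
        if cmd.heading_deg is not None:
            self.desired_heading_deg = cmd.heading_deg % 360.0

    def step(self, dt_s: float) -> None:
        # Signed shortest turn from current to desired heading, in [-180, 180).
        error = (self.desired_heading_deg - self.heading_deg + 180.0) % 360.0 - 180.0
        max_turn = self.turn_rate_deg_s * dt_s
        turn = max(-max_turn, min(max_turn, error))
        self.heading_deg = (self.heading_deg + turn) % 360.0


aircraft = SimulatedAircraft(heading_deg=350.0)
aircraft.command(FlightCommand(heading_deg=10.0))   # the user enters a desired heading
for _ in range(10):
    aircraft.step(1.0)                              # the simulator flies the turn
print(aircraft.heading_deg)
```

The user of such an interface never specifies how the turn is flown, which is the compromise in realism described above.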

Because HIP was to be used in a simulated
environment with an unknown (and possibly large)
number of other agents, HIP could not adversely
affect the normal operation of ModSAF; the
added functionality had to be non-invasive.
Similarly, HIP also had to be transparent to ModSAF
so that ModSAF would work normally if no HIP
agent were needed. The implementation decisions
outlined below allowed HIP to achieve these
requirements. Finally, since HIP is usually run from
the same workstation as the simulator, its display
had to minimize interference with the ModSAF
display. This drove the decision to make the HIP
F-14 and MiG-29 windows as small as possible.

Another factor in the design of HIP was that it
needed to be simple to use. Building an interface
that required pilot skill would have defeated many
of the functionalities described above. This
design constraint was met by using a graphical
interface built to resemble a highly schematic cockpit.
Flight controls are modified using a mouse with
"click-and-drag" widgets. This allows a novice
user to receive instruction on HIP and be "up
and flying" in less than five minutes.
Additionally, demonstrations and reviews can now include
sessions in which onlookers can participate in an
engagement, with or against an IFOR agent.

Finally, the time required to create HIP had
to be considered, taking into account all the
design goals and constraints mentioned heretofore.
HIP is intended as a tool to aid IFOR research
and not an end unto itself. The system had to
be developed quickly, with a minimum impact on
project personnel. Additionally, the completed
system had to support the addition of features
and enhancements without significant effort. The
basic implementation strategy derives from this
constraint.

Human Control Tool/Simulator Interface.
One of the first issues faced in the development
of HIP was how it would communicate with
ModSAF. One possibility was to make HIP a part of
the ModSAF executable. The advantage of this
approach was that communication could be
accomplished via parameters to function calls,
which would allow for very fast transmission and as high
a bandwidth as necessary. The disadvantage
was that making HIP a part of the ModSAF
application would make the executable larger and
slower. As mentioned above, one of our high-level
design goals was to limit any adverse effects HIP
might have on the ModSAF system. Because of
this, HIP was not incorporated into the ModSAF
application.

The approach we chose was to make HIP a separate
executable which communicated with ModSAF
through UNIX sockets. A variety of interprocess
communication packages would have
been appropriate, but sockets were chosen because
working code was available. The main advantage
of the separate process approach is that HIP
can be run on a separate machine and therefore
has very little effect on the speed of ModSAF.
Also, the addition of the HIP communication interface
added only 50 kilobytes to the ModSAF
executable, while the entire HIP package would
have added well over a megabyte. The transfer
rate through the sockets seems well able to
keep up with ModSAF; however, if this becomes
a problem, more efficient interprocess communication
techniques are available.

Moded vs. Unmoded User Interface
In addition to these requirements, the control structure
of the final interface needed to be unmoded. A
moded system is simply one in which certain actions
can be made only when initiated by a previous
series of actions. Conversely, an unmoded
design allows most functionality to be accessed
independent of other actions. This capability was
important for HIP since most actions occur as a
response to the current situation. For example, a
pilot may decide to turn in response to a number
of different situations: firing a missile, evading an
approaching missile, receiving an order from the
lead or taking advantage of the tactical situation.
Thus, it is important that a HIP pilot be able to
turn at any time. Such behavior is supported by
HIP’s unmoded design. Specifically, all controls
in the HIP interface can be accessed at any time.
Controls that require a series of steps (e.g., loading
and then firing a weapon) must be ordered by
the HIP pilot. Therefore, pressing the fire button
has no effect when a missile is not loaded. Such
an unmoded design is consistent with an actual
cockpit and adds to HIP’s ease-of-use.
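
The unmoded behavior described above can be sketched as a
small state object in which every control is callable at
any time. The Cockpit class and its method names are
hypothetical and are used only for illustration; they do
not come from the HIP source.

```python
from dataclasses import dataclass


@dataclass
class Cockpit:
    """Hypothetical unmoded control state: every control is always live."""
    desired_heading_deg: float = 0.0
    missile_loaded: bool = False
    missiles_fired: int = 0

    def turn_to(self, heading_deg: float) -> None:
        # Available at any time, whatever else is in progress.
        self.desired_heading_deg = heading_deg % 360.0

    def load_missile(self) -> None:
        self.missile_loaded = True

    def fire(self) -> bool:
        # The button is always accessible; it simply has no effect
        # when no missile is loaded. Ordering the steps is left to
        # the pilot rather than enforced by a mode.
        if not self.missile_loaded:
            return False
        self.missile_loaded = False
        self.missiles_fired += 1
        return True
```

Note that a turn can be ordered between loading and firing
without cancelling either step, which is the property the
text asks for.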

Quick and Cheap Development
The HIP interface required a sophistication in computer
graphics that would have required either expensive
consultation or time for the designers to
learn such sophistication. However, there are
many software packages available that allow the
design of user interfaces at the “widget” level
rather than the pixel level of most computer
graphics programming. A widget is simply a
pre-defined graphics component with a specific
functionality such as a menu or slide-bar. Examples
of such widget design packages include
X-Designer[3] (Imperial Software Technologies),
Builder Xcessory[4] (Integrated Computer Solutions),
and the Simple User Interface Toolkit[5], or
SUIT (University of Virginia). SUIT was chosen
for this project because it was available to the university
free-of-charge and it included the following
needed features: a reasonable assortment of
widgets, the ability to design widgets with specific
functionality, and good documentation backed by
a large user group.

The ability to create user-defined widgets was
particularly important. For example, the first implementation
of the radar display contained only
textual information and proved difficult to use.
This information was encoded in the subsequent
design of the graphical radar display. The color
of a particular radar contact is used to represent
the classification of an agent as friendly, enemy
or unknown. The shape of the contact shows
whether the contact is held via radar, visual or
both. A vector from the contact gives relative
heading and speed information. This display has
proven simple to use, conveying a great deal of
information via this customization of the widget.
Additionally, the capability to select targets by
clicking on them in the radar display was added.
This removed the necessity of identifying agents
by call sign or vehicle-id when targeting. Other
user-defined widgets include the heading display
and the HUD. These widgets increase both the
functionality and usability of the interface but,
because SUIT supports the design of such widgets,
do not require programming at the pixel
level for such increased capability.
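
The encoding used by the graphical radar display can be
sketched as a mapping from a contact to a drawable glyph.
The paper does not say which colors or shapes HIP uses, so
the palette, the shape names, and the vector scale below
are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(Enum):
    FRIENDLY = "friendly"
    ENEMY = "enemy"
    UNKNOWN = "unknown"


# Hypothetical palette; HIP's actual colors are not given in the text.
COLORS = {
    Classification.FRIENDLY: "green",
    Classification.ENEMY: "red",
    Classification.UNKNOWN: "yellow",
}


@dataclass
class Contact:
    classification: Classification
    on_radar: bool
    in_visual: bool
    relative_heading_deg: float
    speed_kts: float


def glyph(contact: Contact) -> dict:
    """Describe how a contact would be drawn: color, shape, vector."""
    # Shape encodes which sensors hold the contact (names are assumed).
    if contact.on_radar and contact.in_visual:
        shape = "diamond"
    elif contact.on_radar:
        shape = "square"
    elif contact.in_visual:
        shape = "circle"
    else:
        raise ValueError("contact is not held by any sensor")
    return {
        "color": COLORS[contact.classification],
        "shape": shape,
        # The vector's direction gives relative heading and its
        # length gives speed (the 1/100 scale is arbitrary).
        "vector_angle_deg": contact.relative_heading_deg % 360.0,
        "vector_length": contact.speed_kts / 100.0,
    }
```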

The first implementation of HIP (for the F-14)
was done by two graduate students, working
part-time over the course of a semester (approximately
three effort-months). This included the
full functionality of the agent described in Section
4 as well as researching the capabilities and
payloads of the actual aircraft[6,7]. Modifying
HIP for a similar plane (the MiG-29) took about
forty-five minutes of actual coding and about a
day to test after researching the appropriate flight
and weapons parameters. Finally, incorporating a
completely different plane-type, with both added
features and a different functionality (the E-2C),
took only 10 hours of total effort, again after the
appropriate research had been completed. These
efforts showed that our design goals had been
met: a tool had been developed quickly that could
be used in a number of ways and that did not interfere
with the simulator. Additionally, the design
criteria allowed the interface to be modified very
quickly for application to slightly different agents
in the same domain.

Description  of  HIP 

The F-14 version was the first HIP version constructed
(see Figure 1). The display is divided
into three sections. The left-most section includes
the communications interface and widgets for selecting
and deleting HIP agents (HIP may be used
to control more than one F-14 at a time). Communication
is accomplished via a series of windows
that represent message templates. For example,
when the “Current Position” message is
selected, HIP automatically fills in that information
in the template. Messages not corresponding
to the template messages can also be entered.
In the center of the display are the flight controls
as well as buttons for leasing control of an
agent from ModSAF (toggling the transmission
of commands to ModSAF) and quitting ModSAF.
There is also a simple Head Up Display (HUD) in
the center of the window. The flight controls for
heading, altitude and speed include both the current
value (given by the large, filled arrows) and
the desired value (the small arrow in the heading
display, the position of the sliders for altitude and
speed). Finally, the right-most section of the HIP
window consists of the radar display, radar controls,
and weapons controls. As mentioned previously,
targeting simply consists of clicking on
contacts in the radar display.
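
The message templates mentioned above can be pictured as
strings with named slots that are filled from the agent's
current state. The template wording and the field names in
this sketch are hypothetical; the paper names a “Current
Position” message but does not give its text.

```python
from string import Template

# Hypothetical template text and field names, for illustration only.
TEMPLATES = {
    "Current Position": Template(
        "$callsign: position $x, $y at $altitude_ft ft"
    ),
}


def fill_template(name: str, state: dict) -> str:
    """Fill the named message template from the agent's current state."""
    return TEMPLATES[name].substitute(state)


def compose(state: dict, name: str = "", free_text: str = "") -> str:
    """Return a templated message, or free text when no template is named."""
    if name:
        return fill_template(name, state)
    return free_text
```

The compose function reflects the point that messages
outside the template set can also be entered by hand.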

Figure 2 is an example of the HIP E-2C display.
The reduced-function flight controls have
been placed to the right and the radar display
enlarged and moved to the center. Weapons controls
have been deleted and a new window created
for displaying the BRASH of the current
contact-of-interest (COI). This information is generated
automatically when BRASH information is
“radioed” to a TacAir-Soar agent. BRASH information
can be generated with respect to either the
position of the E-2C or another agent. The HIP
E-2C, with both a different display and different
functionality from the HIP F-14 agent, was created
by utilizing the basic, generic structure that
was purposely used in building the F-14 agent.
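
The sketch below shows one way BRASH information could be
computed with respect to an arbitrary reference agent. It
assumes that BRASH stands for bearing, range, altitude,
speed and heading (the acronym is not expanded in this
section) and that positions are given in a flat east/north
coordinate frame; the Track class and its fields are
hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    x: float            # east position, any consistent length unit
    y: float            # north position, same unit as x
    altitude: float
    speed: float
    heading_deg: float


def brash(reference: Track, contact: Track) -> dict:
    """Bearing and range from `reference`, plus the contact's own
    altitude, speed and heading (assumed meaning of BRASH)."""
    dx = contact.x - reference.x
    dy = contact.y - reference.y
    # Compass bearing: 0 is north, 90 is east, measured clockwise.
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    return {
        "bearing_deg": bearing,
        "range": math.hypot(dx, dy),
        "altitude": contact.altitude,
        "speed": contact.speed,
        "heading_deg": contact.heading_deg,
    }
```

Passing the E-2C as the reference gives the report relative
to the controller; passing another agent's track gives the
report relative to that agent.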

Flying in HIP simply requires entering the
name of an appropriate agent from ModSAF in
the Select Plane text box, setting the desired
starting configuration, and then hitting the Take
Control button. Once control has been established,
the user can then hit the button again to
Release Control. As mentioned previously, the
user’s activity while flying is unmoded and any
action can be taken immediately in response to
the user’s evaluation of the current scenario.

Although the flight controls are rudimentary,
experimentation with HIP has shown that sophisticated
maneuvers can be accomplished. However,
in addition to these maneuvers, inexperienced
pilots also often make mistakes. One of the
most obvious is turning too hard, too often, resulting
in stalls. Stalls may be recovered from by diving hard
until speed increases sufficiently for re-engaging
the engines. What is interesting about these maneuvers
is that by just using HIP and getting a
“feel” for flying, TAS designers have become more
comfortable with the problem domain and have
gained insights into many of its features and limitations.
This has proved an unanticipated benefit
of HIP but one that is proving useful, especially
as the agents are modeled at increasing levels of
detail.

Conclusion 

While the examples in this paper have concentrated
on the air-to-air combat domain, human
control tools can be easily transferred to other
domains where a similar interaction is desired
between human and computer generated forces.
The near-immediate extensions to HIP for the
MiG-29 and E-2C demonstrate the extensibility
of the basic tool. In the future we hope to extend
HIP by creating versions for close air support
units, ground forces and other vehicles supported
by ModSAF by utilizing the underlying
ModSAF functions for low-level agent behavior.
Thus, having invested in the implementation of
the basic tool, application to different domains is
considerably simplified.

This paper has discussed some of the decisions
appropriate for developing a simple interface for
human interaction with intelligent forces. These
decisions were constrained by the following questions:

• What functions should be supported by the tool?

• How much effort can be invested in tool development?

• What tools are available to make development
quick, inexpensive and robust?

Figure 1: The HIP F-14 Instrument Panel.

Figure 2: The E-2C Radar Controller Window and Flight Controls.

In attempting to explore the trade-offs and explain
the rationale behind our answers, we hope we
have provided a motivation and framework for the
development of similar tools. We have presented
the Human Instrument Panel as one example of
such a tool and described a wide variety of ways
in which such a human control tool can aid in the
testing and knowledge acquisition necessary for
any IFOR project.
SUIT Users Group.